APPLICATIONS OF TWO OSCULATORY FORMULAS
By John L. Roberts

INTRODUCTION

The main purpose of this paper is to illustrate how Mr. Jenkins’ osculatory
formulas¹ (A) and (B) can be applied in a convenient manner. The first section
of this paper will be little more than a summary of some of the formulas
contained in the other three articles. The second section will contain the
applications.


I. SOME MATHEMATICS OF THE FORMULAS

The Woolhouse notation will in this paper be used to stand for the differences
of $u_x$, which represents the given values of a function. The general formulas are

$y_x = y_0 + x\Delta y_0 + \tfrac{1}{2}x(x-1)B + \tfrac{1}{6}x(x-1)(x-\tfrac{1}{2})C$;   (1)

and

$y_x = u_0 + xa_1 + \tfrac{1}{2}x(x-1)B + \tfrac{1}{6}x(x-1)(x-\tfrac{1}{2})C$.   (2)

The special formulas belonging to (2) are

$B = b - \tfrac{1}{6}d$ and $C = c_1 - \tfrac{1}{6}e_1$,   (A)

where b and d are defined by $b = \tfrac{1}{2}(b_0 + b_1)$ and by $d = \tfrac{1}{2}(d_0 + d_1)$; and

$B = b$ and $C = 0$.   (B)

The special formulas belonging to (1) are

$y_0 = u_0 + \tfrac{1}{4}b_0$, $B = b$, and $C = 0$;   (C)

and

$y_0 = u_0 - \tfrac{1}{36}d_0$, $B = b - \tfrac{1}{6}d$, and $C = c_1 - \tfrac{1}{6}e_1$.   (D)

Formula (C) is equivalent to Mr. Jenkins’ formula (A). Also (D) is equivalent
to his formula (B).


¹ This paper presupposes a knowledge of three other articles. The first one by Mr.
Wilmer A. Jenkins is entitled “Graduation Based on a Modification of Osculatory
Interpolation,” and is printed in the October 1927 issue of the Transactions of the
Actuarial Society of America. The other two papers are mine. One of them is entitled
“Some Practical Interpolation Formulas,” and is printed in the September 1935 issue of
these Annals. The other one entitled “A Family of Osculatory Formulas” is printed in the
October 1935 issue of the Transactions.
II. APPLICATIONS OF (C) AND (D)

First, there is the problem of selecting suitable examples to which (C) and (D) 
can be applied. Secondly, we will then apply in a convenient manner the 
formulas to these examples. 

The problem of selecting suitable examples will now be considered. “The
non-reproducing characteristic of” formula (D) “raises the question of what
will happen in the graduation of a series whose fourth differences are all
positive, say. The answer is that the graduated series will lie everywhere below
the observed points and that the observations will not be correctly represented
by the interpolated series.” On the other hand, if we select a series whose
fourth differences change frequently in sign, (D) because of its non-reproducing 
characteristic has valuable smoothing possibilities. In like manner, (C) may 
be valuable when the second differences change frequently in sign. Mr. Jenkins 
gives at quinquennial ages rates of mortality which were graphically determined 
from the published American Men Ultimate Experience. Since the fourth 
differences of these rates change frequently in sign, we will apply (D) to a few 
of these rates. So far as I know no suitable actuarial examples have been 
found to which (C) can be applied. However, there is the possibility that (C) 
might be valuable in some sciences. Since I do not know of any suitable real 
example to which (C) can be applied, we will apply it to a trivial series whose 
second differences change frequently in sign. 

We are now ready to apply in a convenient manner (C) and (D) to the 
examples selected in the preceding paragraph. 

First, we will apply (C). I have in my other article applied (B) in a
convenient manner. This method with little change can be applied to (C). If
it is desired to apply (C) at either end of the table where values of $u_x$ are not
available for the calculation of the second differences, it can be assumed they
vanish. It is convenient if S and S² represent respectively the major
differences $\Delta u_x$ and $\Delta^2 u_x$ in such a manner that they are arranged centrally in the
working illustration. It is convenient if s and s² represent respectively the
minor differences $\delta y_x$ and $\delta^2 y_x$. The quantity $y_0$ can be computed by
$y_0 = u_0 + \tfrac{1}{4}b_0$, and $y_1$ can be computed in like manner. Since we wish in the working
illustration of (C) to interpolate four values between $y_0$ and $y_1$, the middle
$s = \delta y_{.4} = .2\Delta y_0$, and $s^2 = .04B = .02(b_0 + b_1)$. We can by the use of the
foregoing method apply (C) to suitable functions, whose given values can be
represented by f(r). Then, it follows from the definition of $u_x$ that $f(r) = u_x$.
It might prevent confusion if it is stated that x and r are related to each other
in such a way that we always interpolate between $y_0$ and $y_1$. We shall now
apply (C) to the case when f(r) represents the trivial series shown at top of
page 3.

Finally, we will apply (D). Mr. Henderson has applied (A) in a very
convenient manner. His method with little change can be applied to (D). If it
is desired to apply (D) at either end of the table where values of $u_x$ are not
available for the calculation of the differences required, it can be assumed
that the fourth differences that cannot be computed vanish, and the required
differences can be filled in consistently with that assumption. It is convenient
if S, S², S³, and S⁴ represent respectively the major differences $\Delta u_x$, $\Delta^2 u_x$,
$\Delta^3 u_x$, and $\Delta^4 u_x$ in such a manner that they are arranged centrally in the working
illustration. It is convenient if s, s², and s³ represent the minor differences so that
by definition $s = \delta y_x$, $s^2 = \delta^2 y_x$, and $s^3 = \delta^3 y_x$. The first
$s^2 = \delta^2 y_0 = .04(b_0 - \tfrac{1}{6}d_0)$. The last $s^2 = \delta^2 y_1 = .04(b_1 - \tfrac{1}{6}d_1)$. The
quantity $y_0$ can be computed by $y_0 = u_0 - \tfrac{1}{36}d_0$, and $y_1$ can be computed in like
manner. The middle $s = \delta y_{.4} = .2\Delta y_0 - .008C$. We are now in position to
apply (D) to the quinquennial rates of mortality.


Age     Rate       S         S²        S³         S⁴

72     .07010    .03808
77     .10818    .04669    .00861    .01799
82     .15487    .07329    .02660   -.01946    -.03745
87     .22816    .08043    .00714    .12572     .14518
92     .30859    .21329    .13286    .12572     .00000
97     .52188              .25858               .00000
Age     y_x        s          s²         s³

82     .15591     .012612    .001314
83     .168522    .013527    .000915
84     .182049    .014043    .000516   -.000399
85     .196092    .014160    .000117
86     .210252    .013878   -.000282
87     .22413     .013460   -.000682
88     .237590    .013977    .000517
89     .251567    .015693    .001716    .001199
90     .267260    .018608    .002915
91     .285868    .022722    .004114
92     .30859     .028006    .005314
93     .336596    .034326    .006320
94     .370922    .041652    .007326    .001006
95     .412574    .049984    .008332
96     .462558    .059322    .009338
97     .52188                .010343
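
The working illustration for ages 82 to 87 can be checked by evaluating formula (1) with the constants of (D) directly, rather than by building up the minor differences. The following is only a sketch under that reading of the formulas; the function names `difference_rows` and `formula_d` are illustrative and do not come from the paper, and small disagreements in the sixth decimal place are to be expected because the printed table was built from rounded differences.

```python
def difference_rows(values):
    """Successive forward-difference rows; rows[k][j] is the k-th difference starting at values[j]."""
    rows = [list(values)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


def formula_d(u, i, x):
    """Value at fraction x (0 <= x <= 1) between u[i] and u[i+1] by formula (D).

    Requires the six given values u[i-2] .. u[i+3].
    """
    rows = difference_rows(u)
    b0, b1 = rows[2][i - 1], rows[2][i]      # central second differences at i and i+1
    c1 = rows[3][i - 1]                      # third difference centred on i + 1/2
    d0, d1 = rows[4][i - 2], rows[4][i - 1]  # central fourth differences at i and i+1
    e1 = rows[5][i - 2]                      # fifth difference centred on i + 1/2

    y0 = u[i] - d0 / 36
    y1 = u[i + 1] - d1 / 36
    B = (b0 + b1) / 2 - (d0 + d1) / 12       # B = b - d/6
    C = c1 - e1 / 6
    return (y0 + x * (y1 - y0)
            + x * (x - 1) * B / 2
            + x * (x - 1) * (x - 0.5) * C / 6)


rates = [0.07010, 0.10818, 0.15487, 0.22816, 0.30859, 0.52188]  # ages 72, 77, ..., 97
graduated = [formula_d(rates, 2, k / 5) for k in range(6)]      # ages 82, 83, ..., 87
for age, y in zip(range(82, 88), graduated):
    print(age, round(y, 6))
```

The intervals 87 to 92 and 92 to 97 would need the end-of-table convention described above (unavailable fourth differences taken as zero), which this sketch does not implement.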





SOME SIMPLE DEVELOPMENTS IN THE USE OF THE 
COEFFICIENT OF STABILITY 


By C. H. Forsyth 

Some time ago the writer proposed¹ a coefficient of stability $C_s$ to be used
to measure the stability of a statistical series, where that coefficient is defined
by the relation

$C_s = \dfrac{\sigma^2}{M}$   (1)

where M denotes the arithmetic mean and σ² the square of the dispersion of
the terms of the series. It was proposed to regard series as unstable (Lexian)
for which the value of the coefficient exceeded unity, and stable otherwise.
The only essential way in which such a procedure differs in results from the 
traditional method is that it includes as stable those series for which the value 
of the coefficient lies between unity and q the probability of failure of the event 
under investigation — series which would be classed as unstable according to 
the traditional method. Stable series — according to either standard — are found 
so rarely in practice and therefore so many series are accepted as fairly stable
which come anywhere near meeting the requirements that replacing q by unity 
as the line of demarcation affects the classification of no known series but 
adds to the effectiveness of the avowed purpose and use of the proposed coeffi- 
cient — to avoid the round-about work of computing values of probabilities. 
Another merit of the use of the coefficient is that it enables one to measure
and therefore compare the stability of several series, a feature which we shall
illustrate later.

In brief, such a coefficient provides a means of introducing the whole Lexian 
theory into Federal publications such as those on vital statistics, since a com- 
parison of the values of the coefficient for, say different communities or countries, 
would be readily grasped by any reader, whereas the traditional method would 
prove too subtle and laborious, and allow no ready comparison of results. 

For purpose of orientation let us illustrate the situation by analyzing a simple 
series both ways — the traditional way and by the use of the coefficient of sta- 
bility. As an example, let us consider the death rates of white infants under 
one year of age for 1919 (considered on page 89 of the Handbook) for those 
states whose frequencies of births are comparable or which vary little from 

¹ Journal of the American Statistical Association, June, 1932.
their average of 47,830, where the number of deaths for each state has been
adjusted to this average as a base.

State      Adjusted Deaths X     X - 3659      (X - 3659)²

Cal              3350              -309            95481
Conn             4700              1041          1083681
Ind              3732                73             5329
Kan              3253              -406           164836
Ky               3686                27              729
Minn             3159              -500           250000
N. Car           3541              -118            13924
Va               3732                73             5329
Wis              3780               121            14641

Total           32933          +1335 -1333       1633950

M = 32933/9 = 3659          σ² = 1633950/9 = 181550          σ = 426

The traditional method would be: The mean M = np = 3659 where n = 47,830,

$p = \dfrac{3659}{47830}, \qquad q = \dfrac{44171}{47830},$

and $\sigma_B^2 = npq = 3659\,q = 3378$,

whence $\sigma_B = 58.15$,

which is the value of the dispersion we should expect if the basic probability
were constant throughout. But the value of the dispersion proves to be
$\sigma = \sqrt{181550} = 426$, and the comparison of the values shows the basic
probability to be very variable and therefore the series to be very unstable or
Lexian.

The computation of the value of the coefficient of stability is much more
simple and direct

$C_s = \dfrac{\sigma^2}{M} = \dfrac{181550}{3659} = 49.6$

whose excess over unity also clearly indicates the instability of the series.
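
Both computations above are short enough to reproduce directly. The sketch below assumes only the definition $C_s = \sigma^2/M$ and the adjusted deaths of the table; the function name `coefficient_of_stability` is illustrative. It takes deviations from the exact mean rather than from the rounded value 3659, so the last digits may differ slightly from the printed figures.

```python
def coefficient_of_stability(occurrences):
    """Return (M, sigma^2, C_s) with C_s = sigma^2 / M for a series of actual occurrences."""
    n = len(occurrences)
    mean = sum(occurrences) / n
    variance = sum((x - mean) ** 2 for x in occurrences) / n
    return mean, variance, variance / mean


adjusted_deaths = [3350, 4700, 3732, 3253, 3686, 3159, 3541, 3732, 3780]
M, var, C_s = coefficient_of_stability(adjusted_deaths)

# Traditional (Bernoulli) dispersion for comparison, with n = 47,830 births.
n_births = 47830
p = M / n_births
sigma_B = (n_births * p * (1 - p)) ** 0.5

print(round(M), round(var ** 0.5), round(C_s, 1), round(sigma_B, 1))
```

The traditional route needs p and q before any comparison can be made, whereas the coefficient needs only the mean and the dispersion.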

Since proposing the coefficient of stability the writer has been impressed by 
the overwhelming proportion of existing series (such as birth rates, various kinds 
of death rates, etc.) which employ arbitrary bases (such as “per thousand,” 
“per ten thousand,” etc.) usually without mention of the actual base. It is 
obvious, of course, that such rates, or occurrences per arbitrary base, say b, 
can first be adjusted to give occurrences per actual base, say B (assuming that 
base B* can be determined) but the work can evidently be performed much
easier. For, since the original series (per arbitrary base b) $X_1, X_2, \cdots, X_N$
would become, on adjustment, $\frac{B}{b}X_1, \frac{B}{b}X_2, \cdots, \frac{B}{b}X_N$, the mean would become
$\frac{B}{b}M$ and the square of the dispersion $\left(\frac{B}{b}\right)^2\sigma^2$, whence the formula for the
coefficient of stability would become

$C_s = \dfrac{\sigma^2}{M}\cdot\dfrac{B}{b}$   (2)


As an example, let us consider the general death rates, per 10,000, of New
Zealand for the years 1921-30.

Year      X      X - 86     (X - 86)²

1921     87         1           1
1922     88         2           4
1923     90         4          16
1924     83        -3           9
1925     83        -3           9
1926     87         1           1
1927     85        -1           1
1928     85        -1           1
1929     88         2           4
1930     86         0           0

Total   862     +10 -8         46

M = 862/10 = 86.2          σ² = 46/10 = 4.6

This example illustrates the danger of using the coefficient of stability unless
the series consists of actual occurrences or unless the actual base is given due
consideration. Without due consideration of the actual base (here the population
of New Zealand) one might easily fall into the error of regarding the value
of the coefficient of stability as 4.6/86.2 and, therefore, the series as very
stable. But the population of New Zealand is about a million and a half and,
therefore the true value of the coefficient of stability is

$C_s = \dfrac{4.6}{86.2}\cdot\dfrac{1,500,000}{10,000} = 8.0$

* Strictly speaking, this actual base B should be constant throughout the series;
otherwise the successive numbers of occurrences (the terms of the series) would not be
comparable. Where, however, the base B varies little from term to term, as usually happens
even in the best of series, such as a series of some kind of rates of the same community
over a short interval, the variation can be ignored, in which case base B (to which the
terms of the series are adjusted) usually means the arithmetic mean of the different bases.
In the first example treated above, the investigation was limited to certain states in an
effort to comply with the rule just mentioned but the example is a poor one since the
variations are still dangerously too large. The situation is saved by the conclusive results.



which shows the series to be unstable. However, before we condemn New
Zealand’s death rates too severely, let us compare her record with those of
other important countries, including our own, for the same period.


General Death Rates (per 10,000)

Country             M        C_s

New Zealand        86.2         8
Australia          94.3        90
Sweden            120.4        96
Scotland          137.3       139
Austria           151.1       536
United States     118.0       830
England-Wales     121.3      1117
France            170.3      1129
Spain             193.7      2190
Italy             163.5      2760
Germany           125.4      6040
Japan             206.4      6800

These results show how extremely unstable most series of general death 
rates are and that the series for New Zealand, while unstable according to 
our strict criterion, enjoys quite an enviable position practically in a class by 
itself. Parenthetically, these results also illustrate fairly well the triviality, 
with respect to results, of replacing q by unity as the critical value of the coeffi- 
cient of stability, discussed at the beginning of this article. 

The values of the coefficient listed above would, of course, be reduced
somewhat in most cases if the trend of the series were first eliminated but the writer
has gone through all this work and found it not worth while; that is, the series
would still remain markedly unstable.
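
Formula (2) and the New Zealand example can be sketched as follows. The function name `stability_from_rates` is illustrative, and the population of a million and a half is the approximate figure quoted in the text. Because this sketch takes deviations from the exact mean 86.2 instead of from 86, it gives about 7.9 where the text prints 8.0.

```python
def stability_from_rates(rates, arbitrary_base, actual_base):
    """Coefficient of stability for rates quoted per arbitrary base b, referred to actual base B.

    Implements C_s = (sigma^2 / M) * (B / b), formula (2).
    """
    n = len(rates)
    mean = sum(rates) / n
    variance = sum((x - mean) ** 2 for x in rates) / n
    return (variance / mean) * (actual_base / arbitrary_base)


nz_rates = [87, 88, 90, 83, 83, 87, 85, 85, 88, 86]   # per 10,000, years 1921-30

naive = stability_from_rates(nz_rates, 10_000, 10_000)       # ignores the actual base
true_value = stability_from_rates(nz_rates, 10_000, 1_500_000)
print(round(naive, 3), round(true_value, 1))
```

The naive ratio is far below unity, while the value referred to the actual base is well above it, which is the danger the text warns against.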

Another development proves useful when, as frequently happens, the actual
base B is unknown to a degree of accuracy desirable for use in formula (2).

From the inequality

$\dfrac{\sigma^2}{M}\cdot\dfrac{B}{b} \le 1$

we obtain

$B \le \dfrac{M}{\sigma^2}\,b$   (3)


which is to be used to show how small an actual base should be for the given 
series to be stable. As an example, let us consider the maternal mortality, 
per 10,000 live births, in the so-called expanding registration area of the United 
States. 
Maternal Deaths in the United States (per 10,000 live births)
(Expanding Registration Area)

Year      X      X - 66     (X - 66)²

1923     67         1           1
1924     66         0           0
1925     65        -1           1
1926     66         0           0
1927     65        -1           1
1928     69         3           9
1929     70         4          16
1930     67         1           1
1931     66         0           0
1932     64        -2           4

Total   665      +9 -4         33

M = 665/10 = 66.5          σ² = 33/10 = 3.3

Hence, by formula (3), $B \le \dfrac{66.5}{3.3}(10,000)$ or about 200,000. The number
of live births varies so greatly that we should probably find it impossible to
agree upon a satisfactory number* to use as an actual base for such an
“expanding area” but we should all agree that it would be so much greater than
200,000 that the instability of the series would be unquestioned.
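
The bound of formula (3) is a one-line computation. The sketch below uses the mean and squared dispersion exactly as printed in the table (66.5 and 3.3); the function name `largest_stable_base` is illustrative.

```python
def largest_stable_base(mean_rate, variance, arbitrary_base):
    """Formula (3): the largest actual base B for which C_s = (sigma^2/M)(B/b) does not exceed 1."""
    return mean_rate / variance * arbitrary_base


bound = largest_stable_base(66.5, 3.3, 10_000)

# At the bound itself the coefficient of formula (2) is exactly unity.
c_at_bound = (3.3 / 66.5) * (bound / 10_000)
print(round(bound), round(c_at_bound, 6))
```

Any actual base larger than this bound gives a coefficient above unity, so a base in the neighborhood of two million leaves no doubt about the instability.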

One must be careful in comparing the results of two or more investigations
like the one just conducted. For example, the analogous result for Canada
for the same period yields B ≤ 113,000 and we might conclude, too hastily,
that the United States series is more stable (or less unstable) whereas any
knowledge whatever of the numbers of live births of the two countries would
show that Canada comes much closer to fulfilling her requirement than the
United States and that the palm must go to Canada. For one thing, Canada
has about the population of New York city and New York city has about
100,000 live births annually. In any case, close decisions in matters of this
kind would be difficult without sufficient information in regard to actual bases.

There is still another situation which is interesting but of much less
importance because of the rarity of its occurrence. It will be recalled that the
coefficient of stability was devised mainly to avoid the use and computation of
probabilities and that the only difference between the results by the traditional
method and by the use of the coefficient of stability lies in the trivial
replacement of the critical value q by unity. In the traditional method of analysis,
but by comparing the value of the coefficient of stability with q, the coefficient
is evidently always, strictly speaking, a function of the actual base B. In
other words, there is no statistical series, however stable it may seem (except
for the trivial case when all the terms of the series are exactly the same), but
what would be unstable if the base were small enough. It is possible to
formulate the limit once for all below which the given (otherwise seemingly stable)
series would prove unstable.

* It was in the neighborhood of two million in 1932.

If, in the relation $\sigma^2 \le npq$ (for stability) we replace p by M/n, q by 1 - M/n
and then n by B, we obtain

$\sigma^2 \le M - \dfrac{M^2}{B}$ or $\dfrac{M^2}{B} \le M - \sigma^2$

whence, finally

$B \ge \dfrac{M^2}{M - \sigma^2}$   (4)


where the transference of the term M - σ² from one side to the other should
cause no apprehension since, by hypothesis, σ² < M and M - σ² is therefore
always positive. We propose to employ formula (4) in those rare cases where
the value of the coefficient of stability of actual occurrences (but without
reference to an actual base) is less than unity, that is, where the given series
proves to be stable according to the method proposed by the writer, and
determine the upper limit of the values of the base B for which the series would
be unstable according to the traditional method of analysis. As an illustration,
let us consider the familiar series of annual football fatalities in this country
for the period 1906-1930* (omitting the years when no records were kept).


Football Fatalities

Year    Fatalities        Year    Fatalities

1906       11             1917       12
1907       11             1921       12
1908       13             1923       18
1909       12             1925       20
1911       11             1926        9
1912       13             1927       17
1913        5             1928       18
1914       13             1929       12
1915       15             1930       13


It is easily verified that $C_s = \dfrac{11.942}{13.055}$ which is clearly less than unity; whence
the series clearly seems stable. Applying formula (4)

$B \ge \dfrac{13.055^2}{13.055 - 11.942}$ or 153


which shows that the given series is stable as long as the total number of
football players exceeds the number 153. A recent news item quoted an estimate
of the number of players participating in games of four hundred colleges as about
13,000 and over 600,000 including high schools and all. We can then definitely
say that the series just considered is stable. Such a conclusion has no bearing,
of course, upon what might happen if other terms were added to the series.
It happens that adding the records for the next five years, 1931(33), 1932(32),
1933(27), 1934(25), 1935(30), would change the whole series to an unstable
one with $C_s = 56.9/16.6 = 3.4$; but, obviously, the additional records belong
to a new regime of collection.
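
The football example, including the effect of the five added years, can be reproduced with a few lines. This is a sketch only; the helper names `mean_and_variance` and `smallest_stable_base` are illustrative. Values are computed without intermediate rounding, so they may differ from the printed ones in the last digit.

```python
def mean_and_variance(series):
    """Arithmetic mean M and squared dispersion sigma^2 of a series."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    return mean, variance


def smallest_stable_base(mean, variance):
    """Formula (4): B >= M^2 / (M - sigma^2); only meaningful when sigma^2 < M."""
    if variance >= mean:
        raise ValueError("formula (4) requires sigma^2 < M")
    return mean ** 2 / (mean - variance)


fatalities = [11, 11, 13, 12, 11, 13, 5, 13, 15,    # 1906-1915 (1910 omitted)
              12, 12, 18, 20, 9, 17, 18, 12, 13]    # 1917-1930 (years without records omitted)
M, var = mean_and_variance(fatalities)
C_s = var / M
base = smallest_stable_base(M, var)

extended = fatalities + [33, 32, 27, 25, 30]        # 1931-1935
M_ext, var_ext = mean_and_variance(extended)
C_ext = var_ext / M_ext

print(round(M, 3), round(var, 3), round(C_s, 3), round(base), round(C_ext, 1))
```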



INTERNAL AND EXTERNAL MEANS ARISING FROM THE SCALING 
OF FREQUENCY FUNCTIONS 

By Edward L. Dodd 

The scaling¹ of frequency functions has been discussed from the standpoint
of maximum likelihood. But the likelihood criterion to be satisfied sometimes 
leads to a minimum likelihood; and sometimes to neither a maximum nor a 
minimum. Scaling will be studied in this paper with reference to the likelihood 
actually secured, and also with reference to the character of means obtained, 
whether internal or external. 


SECTION 1. INTRODUCTION 


It is well known that a scale obtained in a curve-fitting process is sometimes
a mean. Thus, with the normal function

(1)   $\dfrac{1}{a\sqrt{2\pi}}\, e^{-(x/a)^2/2}$

if the scale a is to be obtained from measurements, $x_1, \cdots, x_n$, we
commonly accept the value

(2)   $a = \sqrt{(x_1^2 + x_2^2 + \cdots + x_n^2)/n}$,

that is, the root-mean square of the measurements. Here, the positive value
of a is naturally taken. It is called the standard deviation, and thought of as
an appropriate new unit of measure.

But even with the $x_i$ all negative, and the a taken positive, O. Chisini²
considered it proper to regard a as a mean of the x's, albeit an external mean.
From Chisini's viewpoint, this a whether regarded as positive or negative is
primarily a solution of

(3)   $x_1^2 + x_2^2 + \cdots + x_n^2 = a^2 + a^2 + \cdots + a^2$.

In this sum of squares, the single number a may be substituted for each of the
x's. Perhaps this kind of mean should be called a substitutive mean to
distinguish it from the means of general analysis which are always internal.


¹ Fisher, R. A., “On the mathematical foundations of theoretical statistics,”
Philosophical Transactions of the Royal Society of London, Series A, Vol. 222, 309-368, (1921).
See p. 338.

² Chisini, O., “Sul concetto di media,” Periodico di matematiche, Series 4, Vol. 9, 106-116,
(1929).
The normal function is a particular case of a more general function:

(4)   $\text{Constant}\cdot a^{-1} e^{-|t|^p/p}, \qquad t = x/a.$

The likelihood method to find the scale a for this function leads to power means,
including the arithmetic mean, the root-mean-square, root-mean-cube, etc., for
p = 1, 2, 3, etc.
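
The statement about power means can be illustrated numerically. The sketch below assumes the exponent of (4) is $\phi(t) = -|t|^p/p$, so that $t\phi'(t) = -|t|^p$ and the likelihood condition $\sum t_i\phi'(t_i) + n = 0$ becomes $\sum |x_i/a|^p = n$. The function names and the sample measurements are illustrative only.

```python
def power_mean_scale(xs, p):
    """Scale a solving sum(|x_i / a|**p) = n, i.e. the p-th power mean of the |x_i|."""
    return (sum(abs(x) ** p for x in xs) / len(xs)) ** (1 / p)


def likelihood_condition(xs, a, p):
    """Left member sum(t_i * phi'(t_i)) + n, assuming phi(t) = -|t|**p / p."""
    return sum(-abs(x / a) ** p for x in xs) + len(xs)


xs = [-3.0, 4.0, 1.0, -2.0]          # hypothetical measurements from the location m
scales = {p: power_mean_scale(xs, p) for p in (1, 2, 3)}
residuals = {p: likelihood_condition(xs, scales[p], p) for p in (1, 2, 3)}
print(scales, residuals)
```

For p = 2 the result is the root-mean-square of (2); for p = 1 it is the arithmetic mean of the absolute values.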

The word scale will be used only for a positive number, which then may be
regarded as a unit of measurement.

For measurements, $x_1, x_2, \cdots, x_n$ Chisini regarded M as a mean, relative
to a function G, provided

(5)   $G(x_1, x_2, \cdots, x_n) = G(M, M, \cdots, M).$

If a solution of this equation is

(6)   $M = F(x_1, x_2, \cdots, x_n),$

and c is a possible value for the x's, it follows at once that

(7)   $F(c, c, \cdots, c) = c,$

or at least one value of this F is c. Conversely, if (7) is satisfied, it is but a
change of notation to replace c in (7) by M, and to combine this with (6) to
obtain

(8)   $F(x_1, x_2, \cdots, x_n) = F(M, M, \cdots, M).$

Hence, this F which in (6) gives explicit form to the implicit M found in (5)
may also be thought of as a mean-forming function, such as G in (5). Briefly,
F is a particular G. Thus $F(x_1, x_2, \cdots, x_n)$ is a mean of $x_1, x_2, \cdots, x_n$, if F
is so constructed that (7) is satisfied when the arguments are all equal.
Inasmuch as a frequency function f(t) is non-negative, log f(t) is real, say
$\phi(t)$ plus constant. Following R. A. Fisher, it will be convenient to write

(9)   $f(t) = C a^{-1} e^{\phi(t)}, \qquad C = \text{Constant}.$

With location m already determined, the x's will be thought of as measured
from m. And we set

(10)   $t = x/a, \qquad t_i = x_i/a, \qquad i = 1, 2, \cdots, n.$

The “productive” probability, to yield $x_1, x_2, \cdots, x_n$, is then

(11)   $L = \prod f(t_i) = C^n a^{-n} e^{\sum \phi(t_i)}.$

This is proportional³ to the “likelihood” of a. Also, it may be noted in
passing, the productive probability is also proportional to the a posteriori
probability, if a constant a priori probability is postulated. The likelihood
will here be taken as $\prod f(t_i)$ itself; and it will be designated by L (in Fisher's
notation, $L = \log \Pi$). Of course, $\Pi$ and $\log \Pi$ take maximum values
simultaneously, if at all. From (11) it follows⁴ that

(12)   $-a\,\partial \log L/\partial a = n + \sum t_i\phi'(t_i) = \sum \{t_i\phi'(t_i) + 1\}.$

The equation

(13)   $\sum t_i\phi'(t_i) + n = 0 \qquad (i = 1, 2, \cdots, n)$

will be called the likelihood condition, whether this leads to maximum
likelihood, to minimum likelihood, or to neither. A second differentiation⁵ leads to

(14)   $a^2\,\partial^2 \log L/\partial a^2 = \sum t_i^2\phi''(t_i) - n = \sum \{t_i^2\phi''(t_i) - 1\}.$

When negative, this indicates a maximum likelihood; when positive, a minimum
likelihood for the a obtained from (13).

³ Loc. Cit., Fisher, p. 310.

Preparatory to the theorems of the next section, just one more matter will
be discussed. The unit for t is arbitrary; and it may be convenient to write,
with $k \ne 0$,

(15)   $\phi(t) = \phi(ku) = \Phi(u), \qquad t = ku.$

Then

(16)   $t\phi'(t) = u\Phi'(u).$

Suppose, now, that a positive constant k can be found such that $k\phi'(k) = -1$.
Then, with t = ku, as postulated,

(17)   $1\cdot\Phi'(1) = k\phi'(k) = -1.$

Thus $\Phi'(1) = -1$, or as it will now be written $\phi'(1) = -1$, is no more
restrictive than the condition that some positive k exists such that $k\phi'(k) = -1$.

⁴ Loc. Cit., Fisher, p. 338.

⁵ Loc. Cit., Fisher, p. 339.

SECTION 2. GENERAL THEOREMS CONCERNING THE SCALE AS A MEAN

Theorem I

Given the frequency function

(18)   $f(t) = C a^{-1} e^{\phi(t)}, \qquad t = x/a, \qquad t_i = x_i/a, \qquad C = \text{Constant}.$

And suppose that

(19)   $\phi'(1) = -1.$

Suppose, also, that for given $x_1, x_2, \cdots, x_n$, the likelihood condition (13),
now written

(20)   $\sum_{i=1}^{n} (x_i/a)\phi'(x_i/a) + n = 0,$

has a positive solution

(21)   $a = F(x_1, x_2, \cdots, x_n).$

Then this a, the scale, is a mean.

Proof. With each $x_i = 0$, (20) cannot be satisfied.

But if, with c > 0, we take each $x_i = c$, and at the same time set a = c,
then, by (19), $\sum = -n$; and thus (20) which gives a implicitly is satisfied.
The explicit a in (21) is therefore such a function F that (7) is satisfied. Hence,
the scale a is a mean.

Theorem II

Given the frequency function

(18)   $f(t) = C a^{-1} e^{\phi(t)}, \qquad t = x/a, \qquad t_i = x_i/a, \qquad C = \text{Constant}.$

Suppose that

(19)   $\phi'(1) = -1,$

and that

(22)   $|t\phi'(t)| < 1$ if $|t| < 1.$

Moreover, suppose that the likelihood condition (20) for measurements
$x_1, x_2, \cdots, x_n$, has a positive solution a. Then

(23)   $a \le \text{Maximum } |x_i|.$

Or, suppose that, in place of (22), we have

(24)   $|t\phi'(t)| > 1$ if $|t| > 1;$

and that $t\phi'(t)$ keeps the same sign, if $|t| > 1$. Then

(25)   $\text{Minimum } |x_i| \le a.$

Proof. Suppose, if possible, that $a > \text{Max } |x_i|$. Then each $|x_i/a| < 1$,
and by (22), $|(x_i/a)\phi'(x_i/a)| < 1$. Then (20) is not satisfied, since $|\sum| < n$.
Thus the hypothesis is contradicted.

Now (25) is satisfied at once if any $x_i = 0$. But suppose, on the other hand,
that $\text{Min } |x_i| > 0$; and, if possible, that $a < \text{Min } |x_i|$. Then, by (24) et
seq., since $|x_i/a| > 1$, it follows that $|\sum| > n$. And thus (20) is again
contradicted.

Theorem III

Given the frequency function

(18)   $f(t) = C a^{-1} e^{\phi(t)}, \qquad t = x/a, \qquad t_i = x_i/a, \qquad C = \text{Constant};$

and set $\psi(t) = t\phi'(t) + 1$. Suppose that

(26)   $\lim_{t \to 0} \psi(t) = \alpha, \qquad \lim_{|t| \to \infty} \psi(t) = \beta, \qquad \alpha\beta < 0.$

And suppose that $\psi(t)$ is continuous when $t \ne 0$.

Then, for any set of real numbers, $x_1, x_2, \cdots, x_n$, of which none is zero,
there exists a positive number a, as scale, such that the likelihood condition

(20)   $\sum_{i=1}^{n} (x_i/a)\phi'(x_i/a) + n = 0$

is satisfied.

The conclusion is also valid, if in place of the limit $\beta$, there is postulated

(27)   $\lim_{t \to -b+0} \psi(t) = -\alpha\,|\infty| = \lim_{t \to c-0} \psi(t),$

where b > 0, c > 0, and $\psi(t)$ is continuous for -b < t < 0 and for 0 < t < c.
That is, the new limits are to be infinite with sign opposite to that of $\alpha$.

Proof. The limits for $t \to 0$ and for $|t| \to \infty$ are the same as the limits
for $a \to \infty$ and $a \to 0+$, noting that t = x/a, $x \ne 0$. Thus $\sum\psi(x_i/a)$ changes
sign as a goes from 0+ to $\infty$. Hence, since $\psi(t)$ is continuous, (20) is satisfied
for some positive a.

For the proof of the second part of the theorem, suppose that $x_n > 0$ and
that $x_n$ is the greatest $x_i$. Then with $a > x_n/c$, but approaching $x_n/c$, $\psi(x_n/a)$
becomes infinite with sign opposite to that of $\alpha$. Furthermore, in $\sum\psi(x_i/a)$,
the positive x's < $x_n$ have a negligible effect; and thus $\lim \sum\psi(x_i/a)$, as
$a \to (x_n/c) + 0$, is infinite with sign opposite to that of $\alpha$, when this sum $\sum$
is taken for the positive x's. Likewise, if $x_1 < 0$, and is the least $x_i$, $\lim \sum\psi(x_i/a)$,
as $a \to (-x_1/b) + 0$, is infinite with sign opposite to that of $\alpha$, when this sum
is taken for the negative x's. If, now, the measurements happen to be all
positive, we think of a as approaching $x_n/c + 0$; and the continuity condition
leads to an a which makes $\sum\psi(x_i/a) = 0$. Likewise, if the measurements
happen to be all negative, we use $-x_1/b + 0$. If both positive and negative
x's appear, we use the greater of the two ratios $-x_1/b$ and $x_n/c$.

SECTION 3. SOME FAIRLY REGULAR FREQUENCY FUNCTIONS 

To illustrate the foregoing theorems in a somewhat general manner, consider
the measurements, $x_1, x_2, \cdots, x_n$, and with $t = x/a$, $t_i = x_i/a$, set up the
function:

(28)   $f(t) = C a^{-1} |t|^p (1 + k^2t^2)^{-q} e^{-rk^s|t|^s},$

where, as before, C is a suitably chosen constant.

Suppose also that

(29)   $p > -1, \qquad q \ge 0, \qquad r \ge 0, \qquad s \ge 0;$

and that either

(30)   $r > 0, \; s > 0$ or $r = 0, \; 2q > p + 1.$

Then with $\phi(t) = \log f(t)$, it follows that, when $t \ne 0$,

(31)   $t\phi'(t) + 1 = (p + 1) - rsk^s|t|^s - 2qk^2t^2(1 + k^2t^2)^{-1}.$

Now the condition $1\cdot\phi'(1) = -1$ would be satisfied if $\Psi(k) = 0$, where

(32)   $\Psi(k) = rsk^{s+2} + rsk^s + (2q - p - 1)k^2 - (p + 1).$

But, under the conditions (29) and (30) $\Psi(0) < 0$, and $\Psi(\infty) > 0$. Hence,
there is a positive k for which $\Psi(k) = 0$. Then if k be assigned this value,
(19) is satisfied; and by Theorem I, any scale a that the likelihood condition
(20) may lead to is a mean. But, by Theorem III a scale a will actually exist,
indeed, for any positive k that may be used in (29); since the limit of $t\phi'(t) + 1$
is positive as $t \to 0$, and is negative as $|t| \to \infty$.

Moreover, if in (29), the further condition $-1 < p \le 0$ is introduced, (22) is
satisfied. And, thus, $a \le \text{Maximum } |x_i|$. Also, $|t\phi'(t)|$ increases with $|t|$.
Hence, by (24) et seq., $\text{Minimum } |x_i| \le a$.

If in (28), we set q = 0, s = 1, r > 0, and confine our attention to positive
x and t, there is obtained the Pearson Type III. Reference to (32) shows that
$\Psi(k) = 0$ if $k = (p + 1)/r$. With this substitution,

(33)   $f(t) = C' a^{-1} t^p e^{-(p+1)t}, \qquad C' = \text{Constant}.$

Since $\phi'(1) = -1$, any solution of the likelihood condition is a mean. Here,
with t > 0, $t\phi'(t) = p - (p + 1)t$, and $t^2\phi''(t) - 1 = -(p + 1)$. From (14)
we see that, with p + 1 > 0, any mean obtained corresponds to maximum
likelihood and the single maximum found is actually the largest value. Moreover,
with the measurements, $x_1, x_2, \cdots, x_n$, all positive, a scale a will exist, as
noted in the general case (28).
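
For the Type III form (33), the likelihood condition can be solved numerically, and the scale found turns out to be the arithmetic mean of the measurements, since $\sum\{p - (p+1)x_i/a\} + n = 0$ gives $a = \sum x_i/n$. The sketch below is an illustration only: the bisection routine `solve_scale`, the bracket, the value p = 2 and the sample measurements are all assumptions made for the example, and the sign change it relies on is the one asserted in Theorem III.

```python
def solve_scale(xs, psi, lo, hi, iterations=200):
    """Bisection for a in [lo, hi] with sum(psi(x / a)) = 0; needs a sign change on the bracket."""
    def total(a):
        return sum(psi(x / a) for x in xs)

    f_lo = total(lo)
    if f_lo * total(hi) > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = total(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return (lo + hi) / 2


p = 2.0                              # any p > -1 would do; chosen for illustration


def psi(t):
    """psi(t) = t*phi'(t) + 1 for Type III in the form (33): p - (p + 1)t + 1."""
    return (p + 1) * (1 - t)


xs = [1.2, 0.7, 3.1, 2.0]            # hypothetical positive measurements
a = solve_scale(xs, psi, min(xs) / 10, max(xs) * 10)
print(a, sum(xs) / len(xs))
```

The scale found lies between the least and greatest measurement, as an internal mean should.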

In passing, it may be noted that Type III appears⁶ rather naturally in a
form giving $\phi'(1) = -1$ at once, without any transformation. Here, then, a
scale is a mean.

⁶ Carver, H. C., Handbook of Mathematical Statistics, Chap. VII, see p. 105, Line 4,
noting that $\phi' = y'/y$.

Given the Pearson Type I in the form

(34)   $f(t) = C a^{-1}(b + kt)^p (c - kt)^q, \qquad t = x/a, \qquad b > 0, \; c > 0, \; |pq| > 0.$

If p + q + 1 > 0, it is possible to find a positive k so that with $\phi = \log f$,
$\phi'(1) = -1$. In this case, any scale found by the likelihood condition is a
mean. With k thus chosen, f(t) has essentially the same form as it would have
if k = 1. Hence for convenience, let us simply set k = 1 in the above equation.
Then for -b < t < c,

$\psi(t) = t\phi'(t) + 1 = 1 + pt(b + t)^{-1} - qt(c - t)^{-1}.$

Suppose now that p > 0 and q > 0. Then Theorem III may be applied; since
$\lim \psi(t) = 1$, as $t \to 0$; but $\lim \psi(t) \to -\infty$, as $t \to -b + 0$, or as $t \to c - 0$.
Hence a scale a satisfying the likelihood condition exists. Moreover, the
likelihood is at a maximum; since, with -b < t < c,

$t^2\phi''(t) - 1 = -pt^2(b + t)^{-2} - qt^2(c - t)^{-2} - 1 < 0.$

This maximum is also the largest value for all values of a.

If the Pearson Type IV is given in the form 

(36) /(t) = Co“'(l 4- t^x/a 

then if p > 1/2, it is possible to find a positive k which will make $\phi'(1) = -1$.
In this case, any scale a is a mean. Moreover, for any $k \ne 0$, the limit of
$t\phi'(t) + 1$ is 1 for $t \to 0$ and is 1 - 2p for $t \to \infty$. Hence, by Theorem III,
if p > 1/2, as above, then a scale a exists satisfying the likelihood condition (20).


SECTION 4. FREQUENCY FUNCTIONS WITH CERTAIN PECULIARITIES 

The theorems of section 2 give sufficient conditions, which in some cases
may not be necessary. Nevertheless, by violating certain hypotheses, particular
functions may be set up which exhibit various peculiarities.

For the Pearson Types, the differential equation is

(36) $\dfrac{y'(t)}{y(t)} = \phi'(t) = \dfrac{a_0 + a_1 t}{b_0 + b_1 t + b_2 t^2}$, $t = x/a$.

The determination of a positive scale $a$ by the Fisher likelihood process is
impossible here in case $a_0 = 0$, $a_1 > 0$, $b_0 + b_1 t + b_2 t^2 > 0$. For in this case
$t\phi'(t) \geq 0$, and thus (20) cannot be satisfied. The U-shaped Type II curves
are in this class. Likewise, if $a_0 \neq 0$, $a_1 = 0$, and $b_0 + b_1 t + b_2 t^2 > 0$ (for
example, with $b_2 > 0$, $b_1^2 < 4b_0b_2$), and the measurements all happen to have
the same sign as $a_0$, such scaling is impossible.

For the purpose of constructing peculiar functions we may take $c > 0$ and
require that the measurements $x_i$ be either $-c$ or $c$, with at least one $-c$ and at
least one $c$, and that $\phi(t)$ be an even function. Then $\phi(-c) = \phi(c)$ and (11)
becomes

(37) $L = [C a^{-1} e^{\phi(c/a)}]^{n}$.

The likelihood condition (13) reduces to

(38) $0 = \psi(t) = t\phi'(t) + 1 = (c/a)\phi'(c/a) + 1$,

with the right member an even function of $c/a$. And from (14), a maximum
likelihood is indicated when

(39) $(c/a)^2\phi''(c/a) - 1 < 0$,

with the left member likewise an even function. A minimum likelihood is
indicated if the left member is positive.

Let us apply this to the case where

(40) $\phi(t) = (-2/3)\log(1 - 3|t|)$; $t\phi'(t) = 2|t|(1 - 3|t|)^{-1}$.

The likelihood condition (38) is satisfied only when $t = \pm 1$. Also $\phi'(1) = -1$.
Thus the only means are the internal means $\pm c$, and the only scale conformable
to (38) is $a = c$. But this has minimum likelihood, since $1^2\phi''(1) - 1 = 1/2 > 0$.
For positive $t$, this function (40) is a Pearson Type.

Consider next a function of the form (28), with $p = -1.25$, $q = -0.6$,
however, for which (31) becomes

(41) $t\phi'(t) + 1 = -\dfrac{1}{4} - \dfrac{t^2}{4} + \dfrac{t^2}{1 + t^2} = -\dfrac{(1 - t^2)^2}{4(1 + t^2)}$,

whence $\phi'(1) = -1$, $\phi''(1) = +1$, $\phi'''(1) = -3$. Here the likelihood condition
(38) has but a single absolute solution $|t| = 1$, leading to the single scale
$a = c$, and to the two internal means, $\pm c$. But in this case $1^2\phi''(1) - 1 = 0$,
so that $\partial^2 \log L/\partial a^2 = 0$. Moreover, for $t = 1$, $\partial^3 \log L/\partial a^3 = a^{-3} \neq 0$. Thus
the only scale obtained by the likelihood method (38), viz., $a = c$, has a
likelihood which is neither at a maximum nor at a minimum.

Another anomalous function is that given by

(42) $\phi(t) = t^4 - 2.5t^2$, $t = \pm c/a$.

The likelihood condition (38) leads to

$\psi(t) = (1 - t^2)(1 - 4t^2) = 0$.

The only solutions are $t = \pm 1$, giving internal means $\pm c$, and $t = \pm 1/2$, giving
external means $\pm 2c$. And from (39) et seq., it can be shown that the internal
mean and scale, $a = c$, has minimum likelihood, while the external mean and
scale, $a = 2c$, has maximum likelihood.
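
The claims made for (42) can be checked numerically. The following is only a minimal sketch, assuming $\phi(t) = t^4 - 2.5t^2$ as read above; the function names `psi` and `second_order` are illustrative and do not come from the paper.

```python
def psi(t):
    # Likelihood condition (38): psi(t) = t*phi'(t) + 1, with phi'(t) = 4t^3 - 5t
    return t * (4 * t**3 - 5 * t) + 1

def second_order(t):
    # Left member of (39): t^2 * phi''(t) - 1, with phi''(t) = 12t^2 - 5
    return t**2 * (12 * t**2 - 5) - 1

for t in (1.0, 0.5):
    # psi vanishes at both roots; the sign of second_order separates minimum from maximum
    print(t, psi(t), second_order(t))
```

A positive value of `second_order` at $t = 1$ corresponds to the minimum at $a = c$, and a negative value at $t = 1/2$ to the maximum at $a = 2c$.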

But it will be noted that a maximum value for a vicinity does not always
signify a largest value for the entire possible range. Indeed, for the function
(42), $a = 2c$ has maximum likelihood without having the largest likelihood.
To avoid such an anomaly, a necessary condition is that as $|t| \to \infty$,
$\phi(t) \to -\infty$, as seen by taking the logarithm of $L$ in (37), noting that as $a \to 0$,
$(-\log a) \to +\infty$.

Finally, avoiding the anomaly just mentioned, let us set up a frequency
function, using the $\psi(t)$ in (38), and writing

$\psi(t) = 1 + t\phi'(t) = (1 - 2t^2)(1 - t^2)(1 - 0.9t^2)$.

From this it follows readily that

(43) $\phi(t) = K - 1.95t^2 + 1.175t^4 - 0.3t^6$, $K$ = Constant.

This, with $t_i = \pm c/a$, leads to an internal mean or scale $a = c$ with minimum
likelihood, a nearby scale $a = c\sqrt{0.9}$ with maximum likelihood (differing
indeed only slightly from the minimum just mentioned) and another scale
$a = c\sqrt{2}$ having maximum likelihood, and this likelihood is indeed greater
than that for any other positive value of $a$. The external mean $a = c\sqrt{2}$
in this case has the largest likelihood. This may be checked by the use of
the logarithm of $L$ as it appears in (37), in which the important part is
$\phi(c/a) - \log a$.
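
The comparison of the three scales can be sketched numerically. This is a hedged illustration only: it assumes the reading $\phi(t) = K - 1.95t^2 + 1.175t^4 - 0.3t^6$ with $K = 0$ and $c = 1$, and the helper name `log_likelihood_part` is hypothetical.

```python
import math

def phi(t):
    # (43) with the constant K taken as 0; K does not affect the comparison
    return -1.95 * t**2 + 1.175 * t**4 - 0.3 * t**6

def psi(t):
    # Factored likelihood condition used to construct (43)
    return (1 - 2 * t**2) * (1 - t**2) * (1 - 0.9 * t**2)

def log_likelihood_part(a, c=1.0):
    # "Important part" of log L per observation: phi(c/a) - log a
    return phi(c / a) - math.log(a)

scales = {"c*sqrt(0.9)": math.sqrt(0.9), "c": 1.0, "c*sqrt(2)": math.sqrt(2.0)}
for name, a in scales.items():
    print(name, round(log_likelihood_part(a), 5))
```

Under these assumptions the value at $a = c\sqrt{2}$ is the largest of the three, and the value at $a = c\sqrt{0.9}$ exceeds that at $a = c$ only marginally, in line with the text.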

In passing it may be noted that if has the form ^(0 «= (1 - with 
£r(l} w, and U ^ Xi/a) then aUy solution a of the likelihood condition 
■= 0 is a mean,— by Theorem I. 

SECTION 5. SUMMARY

When the R. A. Fisher likelihood method is used to find an “optimum” scale 
for frequency functions, it sometimes happens that this scale is a well known 
mean or at least is a substitutive mean (see Equation (5)). Or a simple
transformation (15) may often put the frequency function into such a form.
Conditions are given under which a scale will be a mean. Under further
conditions this mean will be internal, at least as regards absolute values. Finally,
under certain conditions, a scale will exist.

But for certain functions not satisfying these conditions, anomalies appear. 
The scale given by the usual likelihood condition may be a scale with a minimum 
likelihood. Sometimes the likelihood will be at neither a maximum nor a 
minimum. In certain simple cases, no scale exists. Furthermore, it may
happen that the scales which are internal means have minimum likelihood and 
those that are external means have maximum likelihood. Among Pearson 
Types are found both anomalous functions and functions which would be 
regarded as regular as regards maximum likelihood. 

In this problem of scaling, likelihood is proportional to a posteriori probability 
with the a priori probability taken as constant. 



MOMENTS OF ANY RATIONAL INTEGRAL ISOBARIC SAMPLE 
MOMENT FUNCTION 

By Paul S. Dwyer 

Introduction 

The problem of moments of moments has been investigated by a number of 
authors. The assumption of an infinite universe (or that of a finite universe
with replacements) permits the application of the "algebraic" method, the
method of semi-invariants as introduced by Thiele (1) and developed by C. C. 
Craig (2) and the combinatorial analysis method introduced by R. A. Fisher (3) 
and used by N. St. Georgescu (4). A combinatorial analysis method has the 
particular advantage that it enables one to compute separate terms of a given 
formula. 

The formulae for moments of moments have been simplified through the 
use of new moment functions. Thiele introduced the half-invariant (1) which 
resulted in considerable condensation. More recently Prof. R. A. Fisher (3) 
has introduced the sample function k whose expected value is a half invariant. 
The most compact formulization presented thus far is his formulation of the
half invariants of the sample $k_r$ in terms of the half invariants of the universe.
This very compactness, however, makes it difficult to compare results with
those expressed in the more conventional sample functions. Dr. Wishart has
written a paper (7) in which he shows, among other things, how the Fisher results 
can be translated to the more conventional (Craig) results and vice versa, but 
such translation is in general no simple matter. It appears that the Fisher 
results are not immediately useful to the statistician who desires the formulae 
to be expressed in terms of the usual sample moment function. On the other 
hand the Fisher formulization is a remarkable discovery toward that harmony 
which must be naturally inherent in the field of moments of moments. Soper
(6, 111) expressed the general situation when he wrote, “If the terrifying
overgrowth of algebraic formulation accompanying this branch of statistical inquiry
is destined to have a chief utility in induction and going back to causes, then
perhaps Dr. Fisher’s way of estimating a sample will prove to be most fertile,
but if it is to be applied to problems of deduction, say to problems of
successive eventuation such as propagation, then Mr. Craig’s plain moments seem
to have a firmer hold on the exigencies of time.”

It would appear then that the Fisher formulae and the Craig formulae are
both needed. Georgescu (4) showed a partial connection between them in
applying to the $m$ functions a combinatory analysis somewhat similar to that
applied by R. A. Fisher to the $k$ function. It is the purpose of the present
paper to work out a combinatorial procedure for a more general sample function
so that either the Fisher or Georgescu combinatorial results come out as special
cases. In making such a generalization no limitation is placed on the sample
function except that it be rational integral and that all terms are of the same
weight. Thus the results are applicable to products such as $m_r^2$, $m_r k_r$, etc., as well
as to $m_r$ and $k_r$, although they are not applicable to functions such as $\sqrt{m_r}$
or a ratio of two such functions. In this way
the important formulae for the moments of a new sample moment function
will be available by simple substitution as soon as any such new function is
defined by a rational integral isobaric expansion of power sums.

It is thus the purpose of this paper to determine the moments of a general 
moment function of the sample. This is done by keeping the multipliers of 
the various partitions of power sums indefinite until all manipulation is complete. 
It is then possible to assign the definite values of these multipliers which are 
associated with the desired sample function and to obtain the moment of 
the desired moment function in this way. Thus the Fisher result $\kappa(42)$ and
the Craig result $S_{11}(\nu_4, \nu_2)$ are special cases of the new result $\lambda_{11}(f_4, f_2)$. It
is obvious that it is not possible to carry the results using these general moment
functions as far as Fisher and Wishart (3), (6), (7), have carried the results of
the decidedly advantageous (from the standpoint of simplicity of result) $k$ function,
and yet it is surprising to find the simplicity which can be obtained in
the general case. Incidentally the introduction of the more general symbols
clarifies the successive steps of the partition analysis which are somewhat
confusing in any specific case because of the insertion of the value of the
coefficients of the power sums in which the sample moment function is expressed.

This paper is divided into three parts. The first part includes the necessary 
definitions, the basic formulae, and the general development of the algebraic 
method. In order to facilitate the algebraic work there is inserted a table giving 
the expected values of all possible partition products of power sums whose
weight is $\leq 8$. The second part deals with the different sample functions which
might be used. The third part gives a list of the various partition formulae,
of weight $\leq 8$, which contain no unit parts and shows how these can be used in
writing the chief variations of the formulae for moments of moments.

Part I 

1. General Moment Functions. Different moment functions have been defined
in various ways, but all moment functions have in common the property
that they may be expressed in terms of the power sums. It appears sensible
to use this expression in terms of power sums as the working algebraic definition
of moment functions. For example the function $k_3$, which is defined by R. A.
Fisher to be that function of the sample whose expected value is the third
cumulant (half invariant), is to be given the working definition

$$k_3 = \frac{n(3)}{(n-1)(n-2)} - \frac{3(2)(1)}{(n-1)(n-2)} + \frac{2(1)(1)(1)}{n(n-1)(n-2)}$$

where the numerical expressions in parentheses indicate power sums of the
sample.
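
The working definition of $k_3$ can be checked against its defining property by exact enumeration. The sketch below is illustrative only: the three-point universe (`values`, `probs`) and the helper `expected` are assumptions made for the example, not anything taken from the paper.

```python
from fractions import Fraction
from itertools import product

def k3(sample):
    n = len(sample)
    s1 = sum(sample)                  # power sum (1)
    s2 = sum(x**2 for x in sample)    # power sum (2)
    s3 = sum(x**3 for x in sample)    # power sum (3)
    d = (n - 1) * (n - 2)
    return Fraction(n * s3, d) - Fraction(3 * s2 * s1, d) + Fraction(2 * s1**3, n * d)

def expected(stat, values, probs, n):
    # Exact expectation over all ordered samples of size n (infinite universe)
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        total += weight * stat([values[i] for i in idx])
    return total

values = [0, 1, 3]                                       # hypothetical universe
probs = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
mean = sum(p * v for p, v in zip(probs, values))
third_cumulant = sum(p * (v - mean)**3 for p, v in zip(probs, values))
print(expected(k3, values, probs, n=4) == third_cumulant)
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.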

Every term in the definition of a sample function has a “weight” which is
equal to the sum of the orders of the power sums whose product is indicated by
the term. Thus the weight of each of the terms of $k_3$ is 3. If all the terms of a given
moment function have the same weight, the function is called isobaric and
the weight of the function is equal to the weight of each term. Thus $k_3$ is an
isobaric moment function and its weight is 3. Since all the functions so far
proposed are isobaric we limit this generalization of moment functions to
isobaric moment functions, although it is possible that a more complex analysis
could be worked out for non-isobaric functions.

Generality demands the inclusion of every possible partition product of
power sums. Such generality can be obtained by writing

$f_1 = a_1(1)$

$f_2 = a_2(2) + a_{11}(1)^2$

$f_3 = a_3(3) + a_{21}(2)(1) + a_{111}(1)^3$

$f_4 = a_4(4) + a_{31}(3)(1) + a_{22}(2)^2 + a_{211}(2)(1)^2 + a_{1111}(1)^4$

and in general

$f_r = \sum a_{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}} (p_1)^{\pi_1}(p_2)^{\pi_2} \cdots (p_s)^{\pi_s}$

where $(p_1)^{\pi_1}(p_2)^{\pi_2} \cdots (p_s)^{\pi_s}$ indicates any partition product of power sums,
$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}}$ is its coefficient and the summation is taken for every possible
partition. The number of parts of the partition is $\rho = \sum \pi_i$. It may be assumed,
without loss of generality, that the partition is ordered, i.e.

$p_1 \geq p_2 \geq p_3 \geq \cdots \geq p_s$.


A natural numerical coefficient of each term is the number of ways the $r$
units can be collected to form the given partition. This value is given by

$$\binom{r}{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}} = \frac{r!}{(p_1!)^{\pi_1}(p_2!)^{\pi_2} \cdots (p_s!)^{\pi_s}\,\pi_1!\,\pi_2! \cdots \pi_s!}.$$
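
The numerical coefficient just defined is easy to compute directly. The following is a small sketch; the function name `partition_coefficient` and the representation of a partition as a tuple of parts are choices made here for illustration.

```python
from collections import Counter
from math import factorial

def partition_coefficient(parts):
    """Number of ways r = sum(parts) distinct units can be grouped into blocks of the given sizes."""
    denom = 1
    for size, mult in Counter(parts).items():
        # (p!)^pi for the block contents, pi! for the interchangeable equal blocks
        denom *= factorial(size) ** mult * factorial(mult)
    return factorial(sum(parts)) // denom

for parts in [(3, 1), (2, 2), (2, 1, 1), (3, 2), (2, 2, 1)]:
    print(parts, partition_coefficient(parts))
```

For weight 4 this gives 4, 3 and 6 for the partitions 31, 22 and 211, the multipliers that appear in the weight-4 formulae later in the paper.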


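If we set each coefficient equal to

$$\binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, a_{p_1^{\pi_1} \cdots p_s^{\pi_s}},$$

the definition of $f_r$ becomes

$$f_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, a_{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, (p_1)^{\pi_1} \cdots (p_s)^{\pi_s}.$$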

In the present paper the capital letters are used to represent the corresponding
functions of the universe as defined by the corresponding power sums of the
universe. Thus

$$F_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, a_{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, (P_1)^{\pi_1} \cdots (P_s)^{\pi_s}$$

represents the corresponding function of the universe. In the case of the
moment about the mean and the semi-invariant the Greek letters $\mu$ and $\lambda$ have
been used to represent the corresponding function of the universe. In the
case of functions whose notation is quite widely established, it is preferable to
use the conventional notation, but in introducing new functions it appears
wise to use the relationship between small and capital letters since the
correspondence between the English and Greek alphabets is not exactly one to one.
It should be particularly noticed that this notation does not agree with a
previously accepted scheme of using the small English letter to indicate the function
whose expected value is indicated by the corresponding Greek letter. In the
present paper it is not the expected value property which serves as the basis
of notation but rather the definition of the function in terms of the partition
products of power sums.

2. The Working Definition of Moments About a Fixed Point. The sample
functions defined by

$$m'_r = \frac{(r)}{n}$$

are obtained from $f_r$ by placing

$$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \begin{cases} \dfrac{1}{n} & \text{when } s = 1,\ \pi_1 = 1, \text{ and } p_1 = r, \\ 0 & \text{in all other cases.} \end{cases}$$

The Greek $\mu'$ is used to indicate the corresponding function of the universe.

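3. The Working Definition of Moments About the Mean. The moments
about the mean are defined by

$$m_2 = \frac{(2)}{n} - \frac{(1)^2}{n^2}, \qquad m_3 = \frac{(3)}{n} - \frac{3(2)(1)}{n^2} + \frac{2(1)^3}{n^3},$$

$$m_4 = \frac{(4)}{n} - \frac{4(3)(1)}{n^2} + \frac{6(2)(1)^2}{n^3} - \frac{3(1)^4}{n^4},$$

and in general $m_r$ is obtained from $f_r$ by placing

$$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \begin{cases} \dfrac{1}{n} & \text{if } s = 1,\ \pi_1 = 1, \text{ and } p_1 = r, \\ \dfrac{(-1)^{\pi_2}}{n^{\rho}} & \text{if } p_1 > 1,\ \pi_1 = 1,\ s = 2, \text{ and } p_2 = 1, \\ \dfrac{(-1)^{r-1}(r-1)}{n^{r}} & \text{if } p_1 = 1 \text{ and } \pi_1 = r, \\ 0 & \text{in all other cases.} \end{cases}$$

The corresponding moments of the universe are indicated by the conventional $\mu_r$.
For conciseness moments about the mean are referred to as “moments.”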


4. The Working Definition of the Half Invariants. The half invariant
moment functions of Thiele, as applied to the sample power sums, are [see C. C.
Craig (2, 7-10) and Frisch (12, 20-21)]

$$l_1 = \frac{(1)}{n}, \qquad l_2 = \frac{(2)}{n} - \frac{(1)(1)}{n^2}, \qquad l_3 = \frac{(3)}{n} - \frac{3(2)(1)}{n^2} + \frac{2(1)^3}{n^3},$$

$$l_4 = \frac{(4)}{n} - \frac{4(3)(1)}{n^2} - \frac{3(2)^2}{n^2} + \frac{12(2)(1)^2}{n^3} - \frac{6(1)^4}{n^4},$$

and in general

$$l_r = \sum \frac{(-1)^{\rho-1}(\rho-1)!}{n^{\rho}} \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (p_1)^{\pi_1}(p_2)^{\pi_2} \cdots (p_s)^{\pi_s}$$

so that

$$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \frac{(-1)^{\rho-1}(\rho-1)!}{n^{\rho}}.$$

The corresponding moments of the universe are indicated, after Thiele (1)
and Craig (2), by $\lambda$. R. A. Fisher (3) used $\kappa$ while Georgescu (4) used $s$.

In the present paper these functions are referred to as “Thiele moments.”


5. The k Functions of R. A. Fisher. The $k$ statistics of R. A. Fisher are
defined in terms of the sample power sums by

$$k_1 = \frac{(1)}{n}, \qquad k_2 = \frac{(2)}{n-1} - \frac{(1)^2}{n(n-1)},$$

$$k_3 = \frac{n(3)}{(n-1)(n-2)} - \frac{3(2)(1)}{(n-1)(n-2)} + \frac{2(1)^3}{n^{(3)}},$$

$$k_4 = \frac{n(n+1)(4)}{(n-1)^{(3)}} - \frac{4(n+1)(3)(1)}{(n-1)^{(3)}} - \frac{3(2)^2}{(n-2)^{(2)}} + \frac{12(2)(1)^2}{(n-1)^{(3)}} - \frac{6(1)^4}{n^{(4)}}.$$

These values and values for $k_5$ and $k_6$ are given by R. A. Fisher (3, 203-4)
while algebraic methods of attaining them are presented in sections 16, 17.
They are referred to as Fisher moments. The corresponding functions of the
universe, if used, would be represented by $K_r$.


6. The h Function. Just as Fisher introduced a sample function whose
expected value is a Thiele moment of the universe, so it is possible to introduce
a function whose expected value is a moment of the universe. Such a function
is defined by

$$h_2 = \frac{(2)}{n-1} - \frac{(1)^2}{n(n-1)},$$

$$h_3 = \frac{n(3)}{(n-1)(n-2)} - \frac{3(2)(1)}{(n-1)(n-2)} + \frac{2(1)^3}{n^{(3)}},$$

$$h_4 = \frac{(n^2 - 2n + 3)(4)}{(n-1)^{(3)}} - \frac{4(n^2 - 2n + 3)(3)(1)}{n^{(4)}} - \frac{3(2n - 3)(2)^2}{n^{(4)}} + \frac{6(2)(1)^2}{(n-1)^{(3)}} - \frac{3(1)^4}{n^{(4)}}.$$

Methods of obtaining the expansion of this function in terms of power sums
are presented in section 18. The corresponding function of the universe, if it
were used, would be represented by $H_r$.


7. Other Moment Functions. It is possible to obtain an indefinite number of
moment functions. For example one might define a function of weight 2 whose
variance equals $\mu_4$ (or $\mu_2^2$). It is possible by the methods of this paper to
find expressions for such moments.

For reference purposes Table I is provided showing the values of $a$ for each
partition of weight $< 6$ for the functions $m'$, $m$, $l$, $h$, $k$. The values of

$$\binom{r}{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}}$$

are also inserted, in the left hand column, so that it is possible to read from the
table the values for $f_r = m'_r, m_r, l_r, h_r, k_r$ when $r < 6$.


8. Products of f Functions. The product of two or more isobaric functions
is also isobaric and of weight equal to the sum of the weights of the functions.
Thus

$f_2 f_1 = [a_2(2) + a_{11}(1)(1)][a_1(1)] = a_2a_1(2)(1) + a_{11}a_1(1)^3$.

In multiplying $f_{r_1}$ by $f_{r_2}$, any term of $f_{r_1}$ is of weight $r_1$ and when it is
multiplied by any term of weight $r_2$, the result is a term of weight $r_1 + r_2$.

TABLE I

Coefficients of Products of Power Sums in the Expansion of Different Moment Functions

[First part of table not recoverable.]

TABLE I (Concluded)

Numerical
coefficient   a          m'_r   m_r      l_r      k_r                 h_r

10            a_311      0      1/n^3    2/n^3    2(n+2)/(n-1)^(4)    (n^2-4n+8)/n^(5)
15            a_221      0      0        2/n^3    2(n-1)/(n-1)^(4)    (2n-4)/n^(5)
10            a_2111     0      -1/n^4   -6/n^4   -6/(n-1)^(4)        -1/(n-1)^(4)
1             a_11111    0      4/n^5    24/n^5   24/n^(5)            4/n^(5)


R. A. Fisher [3, 207] used the product $k_3^2k_2$ as an illustration of the algebraic
method. The more general $f_3^2f_2$ gives

$f_3^2f_2 = [a_3(3) + 3a_{21}(2)(1) + a_{111}(1)^3]^2[a_2(2) + a_{11}(1)(1)]$

$= a_3^2a_2(3)(3)(2) + a_3^2a_{11}(3)(3)(1)(1) + 6a_3a_{21}a_2(3)(2)^2(1)$

$+ [6a_3a_{21}a_{11} + 2a_3a_{111}a_2](3)(2)(1)^3 + 9a_{21}^2a_2(2)^3(1)^2 + 2a_3a_{111}a_{11}(3)(1)^5$

$+ [6a_{21}a_{111}a_2 + 9a_{21}^2a_{11}](2)^2(1)^4 + [6a_{21}a_{111}a_{11} + a_2a_{111}^2](2)(1)^6 + a_{111}^2a_{11}(1)^8$

which reduces to the value as given by him when the values of $a$ are substituted
from Table I.


9. The Expected Value of Any Partition Product. The expected values of
partition products are well known and are indicated by

$E(p_1) = n\mu'_{p_1}$

$E(p_1)(p_2) = n\mu'_{p_1+p_2} + n(n-1)\mu'_{p_1}\mu'_{p_2}$

$E(p_1)(p_2)(p_3) = n\mu'_{p_1+p_2+p_3} + n(n-1)[\mu'_{p_1+p_2}\mu'_{p_3} + \mu'_{p_1+p_3}\mu'_{p_2} + \mu'_{p_2+p_3}\mu'_{p_1}] + n(n-1)(n-2)\mu'_{p_1}\mu'_{p_2}\mu'_{p_3}$

and in general

$$E(p_1)^{\pi_1}(p_2)^{\pi_2} \cdots (p_s)^{\pi_s} = \sum \binom{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}}{q_1^{\chi_1} q_2^{\chi_2} \cdots q_v^{\chi_v}}\, n^{(t)}\, (\mu'_{q_1})^{\chi_1}(\mu'_{q_2})^{\chi_2} \cdots (\mu'_{q_v})^{\chi_v}$$

where $t = \chi_1 + \chi_2 + \chi_3 + \cdots + \chi_v$ and the two-rowed symbol indicates the
number of ways in which the partition $p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}$ can be grouped to
form the partition $q_1^{\chi_1} q_2^{\chi_2} \cdots q_v^{\chi_v}$.
The continued application of the result above leads to a large number of
formulae. In order to make these results accessible I present in Table II the
expected values of all partition products of weight $\leq 8$. The essence of the
table is the evaluation of the expression

$$\binom{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}}{q_1^{\chi_1} q_2^{\chi_2} \cdots q_v^{\chi_v}}.$$

The numbers at the top of each column indicate the subscripts of the $\mu$'s which must, of
course, be multiplied by $n^{(t)}$. The entries on the extreme left are the numerical
coefficients associated with each row.


10. The Expected Values of the f Functions. With the use of Table II one
is able to write expressions for the expected values of $f_r$ when $r < 9$.

$\mu'_1(f_1) = E(f_1) = a_1n\mu'_1$

$\mu'_1(f_2) = E(f_2) = (a_2 + a_{11})n\mu'_2 + a_{11}n(n-1)\mu'^2_1$

$\mu'_1(f_3) = E(f_3) = (a_3 + 3a_{21} + a_{111})n\mu'_3 + 3(a_{21} + a_{111})n(n-1)\mu'_2\mu'_1 + a_{111}n(n-1)(n-2)\mu'^3_1$, etc.

If the expected values of the f functions are expressed in terms of the moments
about the mean of the universe, these formulae become, since $\mu_1 = 0$,

$\mu'_1(f_1) = 0$

$\mu'_1(f_2) = (a_2 + a_{11})n\mu_2$

$\mu'_1(f_3) = (a_3 + 3a_{21} + a_{111})n\mu_3$

$\mu'_1(f_4) = (a_4 + 4a_{31} + 3a_{22} + 6a_{211} + a_{1111})n\mu_4 + 3(a_{22} + 2a_{211} + a_{1111})n(n-1)\mu_2^2$, etc.

These may be written more symbolically as

$\mu'_1(f_1) = 0$

$\mu'_1(f_2) = b_2n\mu_2$

$\mu'_1(f_3) = b_3n\mu_3$

$\mu'_1(f_4) = b_4n\mu_4 + 3b_{22}n(n-1)\mu_2^2$, etc.

11. The Expected Value of Products of f Functions. The expected value of
products of f functions may be similarly found. For example

$\mu'_2(f_2) = E(f_2^2) = E[a_2(2) + a_{11}(1)^2]^2 = a_2^2E(2)^2 + 2a_2a_{11}E(2)(1)(1) + a_{11}^2E(1)^4$.

Table II can now be used by indicating $a_2^2$ as a multiplier of $E(2)^2$, $2a_2a_{11}$ as a
multiplier of $E(2)(1)(1)$ and $a_{11}^2$ as a multiplier of $E(1)^4$. Then at once it is
evident that

$\mu'_2(f_2) = (a_2^2 + 2a_2a_{11} + a_{11}^2)n\mu_4 + (a_2^2 + 2a_2a_{11} + 3a_{11}^2)n(n-1)\mu_2^2$

$= (a_2 + a_{11})^2n\mu_4 + [(a_2 + a_{11})^2 + 2a_{11}^2]n(n-1)\mu_2^2$

$= b_2^2n\mu_4 + (b_2^2 + 2b_{11}^2)n(n-1)\mu_2^2$.

Similarly

$\mu'_{11}(f_3, f_2) = b_3b_2n\mu_5 + (b_3b_2 + 3b_{21}b_2 + 6b_{21}b_{11})n(n-1)\mu_3\mu_2$

$\mu'_2(f_3) = b_3^2n\mu_6 + (9b_{21}^2 + 6b_3b_{21})n(n-1)\mu_4\mu_2 + (b_3^2 + 9b_{21}^2)n(n-1)\mu_3^2 + (9b_{21}^2 + 6b_{111}^2)n(n-1)(n-2)\mu_2^3$

etc.

where $b_3 = a_3 + 3a_{21} + a_{111}$, $b_{21} = a_{21} + a_{111}$, $b_{111} = a_{111}$. The important
special cases are obtained by assigning the proper values to the $a$'s as given
in Table I. Thus

$$\mu'_2(m_2) = \frac{1}{n^3}[(n-1)^2\mu_4 + (n^2 - 2n + 3)(n-1)\mu_2^2]$$

which agrees with the corrected result of “Student” in 1908 (8, 3) and
Tchouproff (10, 192). Similarly

$$\mu'_{11}(m_3, m_2) = \frac{1}{n^4}[(n-1)^2(n-2)\mu_5 + (n-1)(n-2)(n^2 - 5n + 10)\mu_3\mu_2]$$

$$\mu'_2(m_3) = \frac{1}{n^5}[(n-1)^2(n-2)^2\mu_6 + (-6n + 15)(n-1)(n-2)^2\mu_4\mu_2 + (n^2 - 2n + 10)(n-1)(n-2)^2\mu_3^2 + (9n^2 - 36n + 60)(n-1)(n-2)\mu_2^3]$$

etc.


In the same way

$$\mu'_2(k_2) = \frac{\mu_4}{n} + \frac{(n^2 - 2n + 3)\mu_2^2}{n(n-1)}$$

$$\mu'_{11}(k_3, k_2) = \frac{\mu_5}{n} + \frac{(n^2 - 5n + 10)\mu_3\mu_2}{n(n-1)}$$

$$\mu'_2(k_3) = \frac{\mu_6}{n} + \frac{(-6n + 15)\mu_4\mu_2}{n(n-1)} + \frac{(n^2 - 2n + 10)\mu_3^2}{n(n-1)} + \frac{(9n^2 - 36n + 60)\mu_2^3}{n(n-1)(n-2)}$$

etc.

and

$$\mu'_2(m'_2) = \frac{1}{n}[\mu'_4 + (n-1)\mu'^2_2]$$

$$\mu'_{11}(m'_3, m'_2) = \frac{1}{n}[\mu'_5 + (n-1)\mu'_3\mu'_2]$$

$$\mu'_2(m'_3) = \frac{1}{n}[\mu'_6 + (n-1)\mu'^2_3]$$

etc.
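
The special case $\mu'_2(m_2)$ quoted above, and the general form $b_2^2n\mu_4 + (b_2^2 + 2b_{11}^2)n(n-1)\mu_2^2$ from which it follows, can be checked by exact enumeration. This is a sketch under stated assumptions: the zero-mean three-point universe and all function names are hypothetical choices made for the example.

```python
from fractions import Fraction
from itertools import product

values = [-2, 0, 1]                                      # hypothetical universe
probs = [Fraction(1, 6), Fraction(1, 2), Fraction(1, 3)]  # chosen so the mean is 0

def mu(r):
    # Moment about the mean (the mean is zero by construction)
    return sum(p * v**r for p, v in zip(probs, values))

def m2(sample):
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean)**2 for x in sample) / n

def second_moment_of_m2(n):
    # Exact E[m2^2] over all ordered samples of size n
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        total += weight * m2([values[i] for i in idx])**2
    return total

def student_formula(n):
    return Fraction(1, n**3) * ((n - 1)**2 * mu(4) + (n**2 - 2*n + 3) * (n - 1) * mu(2)**2)

def general_formula(n, a2, a11):
    # b2 = a2 + a11 and b11 = a11, as in section 11
    b2, b11 = a2 + a11, a11
    return b2**2 * n * mu(4) + (b2**2 + 2 * b11**2) * n * (n - 1) * mu(2)**2

n = 4
print(second_moment_of_m2(n) == student_formula(n))
print(general_formula(n, Fraction(1, n), Fraction(-1, n**2)) == student_formula(n))
```

Here $a_2 = 1/n$ and $a_{11} = -1/n^2$ are the coefficients of $m_2$ given in section 3.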


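12. The Expected Value of the Products of f Functions in Terms of the
Thiele Moments of the Universe. The formulae giving the $\mu$'s in terms of the
$\lambda$'s are

$\mu_2 = \lambda_2$

$\mu_3 = \lambda_3$

$\mu_4 = \lambda_4 + 3\lambda_2^2$

$\mu_5 = \lambda_5 + 10\lambda_3\lambda_2$

$\mu_6 = \lambda_6 + 15\lambda_4\lambda_2 + 10\lambda_3^2 + 15\lambda_2^3$

$$\mu_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (\lambda_{p_1})^{\pi_1}(\lambda_{p_2})^{\pi_2} \cdots (\lambda_{p_s})^{\pi_s}$$

where the summation holds for those partitions having no unit parts. See
the results of Craig (2, 7-11) and Frisch (12, 21). It is at once possible to
express the moment formulae in terms of the Thiele moments of the universe.
Thus the general results above become

$\mu'_2(f_2) = b_2^2n\lambda_4 + [3b_2^2n + (b_2^2 + 2b_{11}^2)n(n-1)]\lambda_2^2$

$\mu'_{11}(f_3, f_2) = b_3b_2n\lambda_5 + [10b_3b_2n + (b_3b_2 + 3b_{21}b_2 + 6b_{21}b_{11})n(n-1)]\lambda_3\lambda_2$

$\mu'_2(f_3) = b_3^2n\lambda_6 + [15b_3^2n + (9b_{21}^2 + 6b_3b_{21})n(n-1)]\lambda_4\lambda_2$

$+ [10b_3^2n + (b_3^2 + 9b_{21}^2)n(n-1)]\lambda_3^2$

$+ [15b_3^2n + (27b_{21}^2 + 18b_3b_{21})n(n-1) + (9b_{21}^2 + 6b_{111}^2)n(n-1)(n-2)]\lambda_2^3$.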
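
The relations between the $\mu$'s and the $\lambda$'s at the head of this section can be generated mechanically from the rule "sum over partitions with no unit parts." The following is a minimal sketch; the function names are illustrative, and `cumulant` is an assumed callable supplying $\lambda_p$.

```python
from collections import Counter
from math import factorial

def partitions_no_unit_parts(r, largest=None):
    """Yield the partitions of r into parts >= 2, as non-increasing tuples."""
    if largest is None:
        largest = r
    if r == 0:
        yield ()
        return
    for part in range(min(r, largest), 1, -1):
        for rest in partitions_no_unit_parts(r - part, part):
            yield (part,) + rest

def coefficient(parts):
    # Number of ways of grouping sum(parts) units into blocks of these sizes
    denom = 1
    for size, mult in Counter(parts).items():
        denom *= factorial(size) ** mult * factorial(mult)
    return factorial(sum(parts)) // denom

def central_moment(r, cumulant):
    """mu_r from a callable cumulant(p) returning lambda_p."""
    total = 0
    for parts in partitions_no_unit_parts(r):
        term = coefficient(parts)
        for p in parts:
            term *= cumulant(p)
        total += term
    return total

print({parts: coefficient(parts) for parts in partitions_no_unit_parts(6)})
```

For $r = 6$ the partitions produced are 6, 42, 33 and 222, with the coefficients 1, 15, 10 and 15 of the formula for $\mu_6$.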


13. The Thiele Moments of the f's in Terms of Thiele Moments. It is
now possible to reduce to the Thiele moments of the f's by means of the usual
relations

$\lambda_2(f_r) = \mu'_2(f_r) - \mu'^2_1(f_r)$

$\lambda_{11}(f_{r_1}, f_{r_2}) = \mu'_{11}(f_{r_1}, f_{r_2}) - \mu'_1(f_{r_1})\mu'_1(f_{r_2})$

$\lambda_3(f_r) = \mu'_3(f_r) - 3\mu'_2(f_r)\mu'_1(f_r) + 2\mu'^3_1(f_r)$

etc.

so that the results become

$\lambda_2(f_2) = b_2^2n\lambda_4 + 2[b_2^2n + b_{11}^2n(n-1)]\lambda_2^2$

$\lambda_{11}(f_3, f_2) = b_3b_2n\lambda_5 + \{3[b_3b_2n + b_{21}b_2n(n-1)] + 6[b_3b_2n + b_{21}b_{11}n(n-1)]\}\lambda_3\lambda_2$

$\lambda_2(f_3) = b_3^2n\lambda_6 + \{6[b_3^2n + b_3b_{21}n(n-1)] + 9[b_3^2n + b_{21}^2n(n-1)]\}\lambda_4\lambda_2$

$+ 9[b_3^2n + b_{21}^2n(n-1)]\lambda_3^2 + \{9[b_3^2n + 2b_3b_{21}n(n-1) + b_{21}^2n(n-1) + b_{21}^2n^{(3)}]$

$+ 6[b_3^2n + 3b_{21}^2n(n-1) + b_{111}^2n(n-1)(n-2)]\}\lambda_2^3$

etc.

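The formulae as written are adapted to the partition representation of Part III.
When the f's are equal to the m's we have

$$\lambda_2(m_2) = \frac{(n-1)^2\lambda_4}{n^3} + \frac{2(n-1)\lambda_2^2}{n^2}$$

$$\lambda_{11}(m_3, m_2) = \frac{(n-1)^2(n-2)\lambda_5}{n^4} + \frac{6(n-1)(n-2)\lambda_3\lambda_2}{n^3}$$

$$\lambda_2(m_3) = \frac{(n-1)^2(n-2)^2\lambda_6}{n^5} + \frac{9(n-1)(n-2)^2\lambda_4\lambda_2}{n^4} + \frac{9(n-1)(n-2)^2\lambda_3^2}{n^4} + \frac{6(n-1)(n-2)\lambda_2^3}{n^3}$$

etc.

which are the results as previously given by C. C. Craig (2, 55). In like manner
when the $f_r = k_r$

$$\lambda_2(k_2) = \frac{\lambda_4}{n} + \frac{2\lambda_2^2}{n-1}$$

$$\lambda_{11}(k_3, k_2) = \frac{\lambda_5}{n} + \frac{6\lambda_3\lambda_2}{n-1}$$

$$\lambda_2(k_3) = \frac{\lambda_6}{n} + \frac{9\lambda_4\lambda_2}{n-1} + \frac{9\lambda_3^2}{n-1} + \frac{6n\lambda_2^3}{(n-1)(n-2)}$$

etc.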

as given by R. A. Fisher (3, 210), while

$$\lambda_2(m'_2) = \frac{1}{n}(\lambda_4 + 2\lambda_2^2)$$

$$\lambda_{11}(m'_3, m'_2) = \frac{1}{n}(\lambda_5 + 9\lambda_3\lambda_2)$$

$$\lambda_2(m'_3) = \frac{1}{n}(\lambda_6 + 15\lambda_4\lambda_2 + 9\lambda_3^2 + 15\lambda_2^3)$$

etc.

14. Various Formulization of Results. Although different moment functions
of the universe may be used it is customary to express the results in terms of
universe moments about a fixed point, in terms of universe moments, or in
terms of universe Thiele moments. It is possible to express results in any of
the nine forms

$\mu'(f_r)$, $\mu(f_r)$, or $\lambda(f_r)$, in terms of moments about a fixed point ($\mu'$), moments ($\mu$), or Thiele moments ($\lambda$),

where $f_r$ represents the isobaric sample moment function of weight $r$. One
purpose of such varied formulization is to discover the most compact form
and also the one best adapted to use in the case of a normal universe or a
universe whose moments obey some discoverable law. As suggested above Craig
(2) has shown the relative compactness obtained by using $\lambda(m_r)$ and Thiele
moments of the universe while R. A. Fisher (3) has shown the great additional
compactness obtained by taking $f_r = k_r$.


15. The Application of the Algebraic Method to $\lambda_{21}(f_3, f_2)$. Before leaving
the algebraic method it is perhaps wise to outline the steps in the case of a
more involved problem. We take the example which R. A. Fisher (3, 207)
has used in the case in which $f_r = k_r$: to find $\lambda_{21}(f_3, f_2)$.

The value of $f_3^2f_2$ was found in section 8. To find its expected value it is
only necessary to enter the coefficients of the different partition products in
this expansion at the left of the corresponding rows as indicated in Table II.

The coefficient of any moment partition of the universe is found by multiplying
each column entry by its corresponding left row entry and then by
multiplying by $n^{(t)}$ as indicated at the top. Thus the coefficient of $\mu_8$ is

$(a_3^2a_2 + a_3^2a_{11} + 6a_3a_{21}a_2 + 6a_3a_{21}a_{11} + 2a_3a_{111}a_2 + 9a_{21}^2a_2 + 2a_3a_{111}a_{11}$
$+ 9a_{21}^2a_{11} + 6a_{21}a_{111}a_2 + 6a_{21}a_{111}a_{11} + a_{111}^2a_2 + a_{111}^2a_{11})n$

which after some algebraic work reduces to

$(a_3 + 3a_{21} + a_{111})^2(a_2 + a_{11})n = b_3^2b_2n$.

In this manner it is possible to write the result either in terms of universe
moments about a fixed point or in terms of universe moments. If moments
are used, one may neglect all column partitions involving unity.

It should be noted that the $a$'s defining $k_r$ as given in Table I can be inserted
here if desired. If these multipliers are introduced throughout the rows and
columnar partitions involving unit parts are not used one will arrive at Table I
of R. A. Fisher [3, 208] though there are some slight typographical errors in
his rows for $(3)^2(1)^2$ and $(3)(2^2)(1)$.

Determining all the coefficients in this manner we find after considerable
algebraic manipulation that

$\mu'_{21}(f_3, f_2) = b_3^2b_2n\mu_8 + [b_3^2b_2 + 9b_{21}^2b_2 + 12b_3b_{21}b_{11} + 6b_3b_{21}b_2]n(n-1)\mu_6\mu_2$

$+ [2b_3^2b_2 + 18b_{21}^2b_2 + 18b_{21}^2b_{11} + 6b_3b_{21}b_2 + 12b_3b_{21}b_{11}]n(n-1)\mu_5\mu_3$

$+ [2b_3^2b_{11} + 9b_{21}^2b_2 + 18b_{21}^2b_{11} + 6b_3b_{21}b_2]n(n-1)\mu_4^2$

$+ [36b_{21}^2b_2 + 54b_{21}^2b_{11} + 6b_3b_{21}b_2 + 12b_3b_{21}b_{11} + 12b_3b_{111}b_{11} + 72b_{21}b_{111}b_{11} + 18b_{111}^2b_2]n(n-1)(n-2)\mu_4\mu_2^2$

$+ [b_3^2b_2 + 6b_3b_{21}b_2 + 12b_3b_{21}b_{11} + 27b_{21}^2b_2 + 90b_{21}^2b_{11} + 36b_{21}b_{111}b_2 + 72b_{21}b_{111}b_{11} + 36b_{111}^2b_{11}]n(n-1)(n-2)\mu_3^2\mu_2$

$+ [9b_{21}^2b_2 + 18b_{21}^2b_{11} + 36b_{21}b_{111}b_{11} + 6b_{111}^2b_2 + 36b_{111}^2b_{11}]n(n-1)(n-2)(n-3)\mu_2^4$.

If $f_r = k_r$ the proper values of $b$ are inserted and the expression above becomes
that given by R. A. Fisher (3, 208). For example the coefficient of $\mu_2^4$ is

$$\frac{(9n^3 - 63n^2 + 240n - 420)(n-3)}{n^2(n-1)^2(n-2)}$$

when

$$b_2 = \frac{1}{n}, \qquad b_{11} = -\frac{1}{n(n-1)}, \qquad b_{21} = -\frac{1}{n(n-1)}, \qquad b_{111} = \frac{2}{n(n-1)(n-2)}.$$


The algebraic results involved in changing the general formula above to 
other functions are too extended to present here. A symbolic means of attaining 
them is included in later sections of the paper. 


Part II. The Determination of Specific f Functions

16. Functions Determined by the b's. In Part I it was shown how various f
functions are defined by giving definite values to the coefficients of the power
sums. It is the purpose of this part of the paper to show how functions can
be specified by means of their expected values in terms of moments of the
universe. This is essentially the method used by R. A. Fisher in defining his
$k$ function and it is here extended to other functions. In this case the $b$'s are
first determined and the $a$'s are then found from them. The first moments
of $f_1$, $f_2$, $f_3$ were given in section 10. To these we add, as shown by Table II,

$\mu'_1(f_4) = (a_4 + 4a_{31} + 3a_{22} + 6a_{211} + a_{1111})n\mu'_4 + 4(a_{31} + 3a_{211} + a_{1111})n(n-1)\mu'_3\mu'_1$

$+ 3(a_{22} + 2a_{211} + a_{1111})n(n-1)\mu'^2_2 + 6(a_{211} + a_{1111})n(n-1)(n-2)\mu'_2\mu'^2_1$

$+ a_{1111}n(n-1)(n-2)(n-3)\mu'^4_1$

etc.

These can be written more symbolically in terms of the $b$'s

$\mu'_1(f_2) = b_2n\mu'_2 + b_{11}n(n-1)\mu'^2_1$

$\mu'_1(f_3) = b_3n\mu'_3 + 3b_{21}n(n-1)\mu'_2\mu'_1 + b_{111}n(n-1)(n-2)\mu'^3_1$

$\mu'_1(f_4) = b_4n\mu'_4 + 4b_{31}n(n-1)\mu'_3\mu'_1 + 3b_{22}n(n-1)\mu'^2_2 + 6b_{211}n^{(3)}\mu'_2\mu'^2_1 + b_{1111}n^{(4)}\mu'^4_1$

and in general

$$\mu'_1(f_r) = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, b_{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, n^{(\rho)}\, (\mu'_{p_1})^{\pi_1} \cdots (\mu'_{p_s})^{\pi_s}.$$

The expansion of the function in terms of the power sums of the sample demands
the determination of the $a$'s. This can be accomplished by solving the equations

$a_1 = b_1$

$a_2 + a_{11} = b_2$

$a_{11} = b_{11}$

$a_3 + 3a_{21} + a_{111} = b_3$

$a_{21} + a_{111} = b_{21}$

$a_{111} = b_{111}$

$a_4 + 4a_{31} + 3a_{22} + 6a_{211} + a_{1111} = b_4$

$a_{31} + 3a_{211} + a_{1111} = b_{31}$

$a_{22} + 2a_{211} + a_{1111} = b_{22}$

etc.
The solutions are

$a_1 = b_1$

$a_2 = b_2 - b_{11}$

$a_{11} = b_{11}$

$a_3 = b_3 - 3b_{21} + 2b_{111}$

$a_{21} = b_{21} - b_{111}$

$a_{111} = b_{111}$

$a_4 = b_4 - 4b_{31} - 3b_{22} + 12b_{211} - 6b_{1111}$

$a_{31} = b_{31} - 3b_{211} + 2b_{1111}$

$a_{22} = b_{22} - 2b_{211} + b_{1111}$

$a_{211} = b_{211} - b_{1111}$

$a_{1111} = b_{1111}$.

The values of $a_r$, at least for $r \leq 4$, follow the law

$$a_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (-1)^{\rho-1}(\rho-1)!\; b_{p_1^{\pi_1} \cdots p_s^{\pi_s}}$$

and

$a_{21}$ = ‘$a_2a_1$’ = $b_{21} - b_{111}$, where ‘ ’ indicates that $a_2 = b_2 - b_{11}$ is multiplied by $a_1 = b_1$,
the rule of multiplication being suffixing of subscripts. Similarly $a_{22}$ = ‘$a_2a_2$’ =
‘$(b_2 - b_{11})(b_2 - b_{11})$’ = $b_{22} - 2b_{211} + b_{1111}$.
This statement illustrates a general theorem which will be established later 
in another paper by a different approach that for all cases 

O' =s(p...’’.p..) 

and that 

Or, * * * Or/. 


This theorem enables one to write, with comparative ease, the coefficient of any product of power sums in a sample function whose expected value is defined. For example the functional coefficient of (3)(2) in $f_5$ is

$$\text{'}a_3a_2\text{'} = \text{'}(b_3 - 3b_{21} + 2b_{111})(b_2 - b_{11})\text{'} = b_{32} - b_{311} - 3b_{221} + 5b_{2111} - 2b_{11111}$$

while that of (3)(1)(1) is '$a_3a_1a_1$' $= b_{311} - 3b_{2111} + 2b_{11111}$. If the expected value of the function is known the b's are determined and the values of the above expressions can be found by substitution.
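The law for $a_r$ and the rule of suffixing subscripts can be mechanised. The following is a minimal sketch, not part of the original development: the function names (`partitions`, `ways`, `a_single`, `a_coeff`) are illustrative assumptions, and a subscript such as 211 is represented by the tuple `(2, 1, 1)`. It reproduces the solutions listed above and the coefficient of (3)(2).

```python
from collections import Counter
from math import factorial


def partitions(r, largest=None):
    """Yield the partitions of r as non-increasing tuples."""
    if largest is None:
        largest = r
    if r == 0:
        yield ()
        return
    for p in range(min(r, largest), 0, -1):
        for rest in partitions(r - p, p):
            yield (p,) + rest


def ways(parts):
    """Number of ways of collecting sum(parts) units into the given parts."""
    denom = 1
    for p in parts:
        denom *= factorial(p)
    for mult in Counter(parts).values():
        denom *= factorial(mult)
    return factorial(sum(parts)) // denom


def a_single(r):
    """a_r as {subscript of b: coefficient}, using the stated law."""
    return {q: (-1) ** (len(q) - 1) * factorial(len(q) - 1) * ways(q)
            for q in partitions(r)}


def a_coeff(parts):
    """a_{r1 r2 ...} as the symbolic product 'a_{r1} a_{r2} ...'."""
    result = {(): 1}
    for p in parts:
        nxt = Counter()
        for s1, c1 in result.items():
            for s2, c2 in a_single(p).items():
                # Multiplying two b's means suffixing (merging) their subscripts.
                nxt[tuple(sorted(s1 + s2, reverse=True))] += c1 * c2
        result = dict(nxt)
    return result


print(a_coeff((4,)))    # a_4 in terms of the b's
print(a_coeff((3, 2)))  # functional coefficient of (3)(2) in f_5
```

The dictionary keys are the subscripts of the b's, so `a_coeff((3, 2))` can be compared term by term with the expression for '$a_3a_2$' given above.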


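17. The Values of the Fisher Moments (k functions). The k functions have been defined to be those functions whose expected values are the Thiele moments of the universe. Thus $\mu'_1(k_r) = \lambda_r$ and since

$$\lambda_r = \sum (-1)^{\rho-1}(\rho-1)!\binom{r}{p_1^{\pi_1}\cdots p_s^{\pi_s}}\,\mu_{p_1}^{\pi_1}\cdots\mu_{p_s}^{\pi_s}$$

it follows at once, by comparison with $\mu'_1(f_r)$ in the last section, that

$$b_{p_1^{\pi_1}\cdots p_s^{\pi_s}} = \frac{(-1)^{\rho-1}(\rho-1)!}{n^{(\rho)}}.$$

Thus

$$b_1 = b_2 = b_3 = b_4 = \frac{1}{n}, \qquad b_{11} = b_{21} = b_{31} = b_{22} = \frac{-1}{n^{(2)}}, \qquad b_{111} = b_{211} = \frac{2}{n^{(3)}}, \qquad b_{1111} = \frac{-6}{n^{(4)}}.$$

The insertion of these values in the formulae of section 16 gives the values of a such as those indicated in Table I and in section 5. Thus the coefficient of (3)(2) in $k_5$ is

$$10(b_{32} - b_{311} - 3b_{221} + 5b_{2111} - 2b_{11111}) = 10\left[-\frac{1}{n^{(2)}} - \frac{2}{n^{(3)}} - \frac{6}{n^{(3)}} - \frac{30}{n^{(4)}} - \frac{48}{n^{(5)}}\right] = \frac{-10n}{(n-2)^{(3)}}.$$

The coefficient of (3)(1)(1) is

$$10(b_{311} - 3b_{2111} + 2b_{11111}) = 10\left[\frac{2}{n^{(3)}} + \frac{18}{n^{(4)}} + \frac{48}{n^{(5)}}\right] = \frac{20(n+2)}{(n-1)^{(4)}}.$$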

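18. The h Functions. It is also possible to define a function whose expected value is the moment of the universe about its mean. Thus $\mu'_1(h_r) = \bar\mu_r$ where

$$\bar\mu_r = \sum \binom{r}{p_1^{\pi_1}\cdots p_s^{\pi_s}}\, A_{p_1^{\pi_1}\cdots p_s^{\pi_s}}\, \mu_{p_1}^{\pi_1}\cdots\mu_{p_s}^{\pi_s}$$

and

$$A_{p_1^{\pi_1}\cdots p_s^{\pi_s}} = \begin{cases}
1 & \text{if } s = 1,\ \pi_1 = 1, \text{ and } p_1 = r, \\
(-1)^{\pi_2} & \text{if } p_1 > 1,\ \pi_1 = 1,\ s = 2 \text{ and } p_2 = 1, \\
(-1)^{r-1}(r-1) & \text{if } p_1 = 1,\ s = 1, \text{ and } \pi_1 = r, \\
0 & \text{in all other cases.}
\end{cases}$$

Comparing with the value of $\mu'_1(f_r)$ in section 16 we have

$$b_{p_1^{\pi_1}\cdots p_s^{\pi_s}} = \frac{A_{p_1^{\pi_1}\cdots p_s^{\pi_s}}}{n^{(\rho)}}.$$

The substitution of these values of b in the results of section 16 gives the expansions of $h_r$ in terms of power sums as illustrated by the formulae of section 6 and Table I. Thus the coefficient of (3)(2) in $h_5$ is

$$10(b_{32} - b_{311} - 3b_{221} + 5b_{2111} - 2b_{11111}) = 10\left[0 - \frac{1}{n^{(3)}} + 0 - \frac{5}{n^{(4)}} - \frac{8}{n^{(5)}}\right] = \frac{-10(n-2)}{(n-1)^{(4)}}.$$

Similarly the coefficient of (3)(1)(1) in $h_5$ is

$$10(b_{311} - 3b_{2111} + 2b_{11111}) = 10\left[\frac{1}{n^{(3)}} + \frac{3}{n^{(4)}} + \frac{8}{n^{(5)}}\right] = \frac{10(n^2 - 4n + 8)}{n^{(5)}}.$$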

19. The h' Functions. One line of attack calls for the introduction of new moment functions which will result in simpler formulae. Thus for example, C. C. Craig wrote (2, 37) "It rather seems that the best hopes of effectively further simplifying the problem of sampling for statistical characteristics lie either in the discovery of a new kind of symmetric function of all the observations which may be used to characterize frequency functions and which will be more amenable than either moments or semi-invariants for use in sampling problems, or in, what may very well prove to be much better and more feasible, the abandonment of the method of characterizing frequency functions by symmetric functions of all the observations altogether."

R. A. Fisher has shown that it is possible to introduce symmetric functions which do simplify the resulting formulae appreciably. It is the purpose of this section to introduce an additional symmetric function which simplifies the resulting formulae to a much greater extent. It is admitted that this function does not have all the properties (such as invariance with respect to change of origin) possessed by the Thiele and Fisher functions, but it does have the property of making the resulting formulae simple. It also has the advantage that $\mu(h'_r) = \mu'(h'_r)$.

The basic idea is to find a sample moment function whose expected value is 0. A first attempt, placing every b = 0, is of no avail since every a is also equal to 0 and there is no function. A second attempt is based on the idea of finding the function $h'_r$ whose expected value is $\mu_1^r$. If the universe is assumed to be measured about its mean, as is conventional, it follows at once that $\mu_1 = 0$ and $\mu'_1(h'_r) = 0$ so that

$$\mu_p(h'_r) = \mu'_p(h'_r).$$

This function then has the property that its moments about a fixed point and its moments are identical.

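In order to discover its expansion in terms of power sums, we note

$$\mu'_1(h'_r) = \mu_1^r$$

and it follows at once by comparison with $\mu'_1(f_r)$ in section 16 that $b_{1^r} = \dfrac{1}{n^{(r)}}$ and $b_{p_1^{\pi_1}\cdots p_s^{\pi_s}} = 0$ in all other cases. The a's are determined in the usual way. Thus

$$a_2 = b_2 - b_{11} = \frac{-1}{n(n-1)}, \qquad a_{11} = b_{11} = \frac{1}{n(n-1)}$$

so that

$$h'_2 = -\frac{1}{n(n-1)}\,[(2) - (1)(1)].$$

Similarly

$$h'_3 = \frac{1}{n^{(3)}}\,[2(3) - 3(2)(1) + (1)^3]$$

$$h'_4 = -\frac{1}{n^{(4)}}\,[6(4) - 8(3)(1) - 3(2)(2) + 6(2)(1)(1) - (1)^4]$$

and in general

$$h'_r = \frac{1}{n^{(r)}}\sum \binom{r}{p_1^{\pi_1}\cdots p_s^{\pi_s}} (-1)^{r-\rho}\,[(p_1-1)!]^{\pi_1}[(p_2-1)!]^{\pi_2}\cdots[(p_s-1)!]^{\pi_s}\,(p_1)^{\pi_1}\cdots(p_s)^{\pi_s}.$$

In order to show the simple form in which results can be given we substitute the values of the b's in the results obtained above. Not only does $\mu'_1(h'_r) = 0$, but by section 11

$$\lambda_2(h'_2) = \mu_2(h'_2) = \mu'_2(h'_2) = \frac{2}{n(n-1)}\,\mu_2^2$$

$$\lambda_{11}(h'_3, h'_2) = \mu_{11}(h'_3, h'_2) = \mu'_{11}(h'_3, h'_2) = 0$$

$$\lambda_2(h'_3) = \mu_2(h'_3) = \mu'_2(h'_3) = \frac{6}{n(n-1)(n-2)}\,\mu_2^3$$

while from section 15

$$\lambda_{21}(h'_3, h'_2) = \mu_{21}(h'_3, h'_2) = \mu'_{21}(h'_3, h'_2) = \frac{36\,\mu_3^2\mu_2}{n^2(n-1)^2(n-2)} + \frac{36(n-3)\,\mu_2^4}{n^2(n-1)^2(n-2)}.$$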


It is to be noticed that these formulae contain very few terms and that the 
terms themselves involve very low moments of the universe. This simplicity 
has been attained without making any assumption such as normality, regarding 
the nature of the universe. 
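The properties claimed for $h'_2$ and $h'_3$ can be confirmed by exhaustive enumeration over all samples from a small discrete universe. This is a sketch under stated assumptions: the two toy universes `shifted` and `centred` are invented for illustration, and the names `h2`, `h3`, `expect` are not from the original text.

```python
from fractions import Fraction
from itertools import product


def falling(n, k):
    """Factorial power n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def power_sum(xs, p):
    """The power sum (p) of the sample xs."""
    return sum(x ** p for x in xs)


def h2(xs):
    """h'_2 = -[(2) - (1)(1)] / n(n-1)."""
    n = len(xs)
    return Fraction(power_sum(xs, 1) ** 2 - power_sum(xs, 2), falling(n, 2))


def h3(xs):
    """h'_3 = [2(3) - 3(2)(1) + (1)^3] / n(n-1)(n-2)."""
    n = len(xs)
    s1, s2, s3 = (power_sum(xs, p) for p in (1, 2, 3))
    return Fraction(2 * s3 - 3 * s2 * s1 + s1 ** 3, falling(n, 3))


def expect(stat, dist, n):
    """Exact expected value of stat over all samples of size n from dist."""
    total = Fraction(0)
    for sample in product(dist, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= dist[x]
        total += prob * stat(sample)
    return total


n = 4
# Hypothetical universe with mean 1: the expected value of h'_r should be 1^r.
shifted = {0: Fraction(1, 2), 1: Fraction(1, 4), 3: Fraction(1, 4)}
# Hypothetical universe measured about its mean (mu_1 = 0, mu_2 = 2, mu_3 = 2).
centred = {-1: Fraction(2, 3), 2: Fraction(1, 3)}

print(expect(h2, shifted, n), expect(h3, shifted, n))
print(expect(lambda s: h2(s) ** 2, centred, n))  # compare 2 mu_2^2 / n(n-1)
print(expect(lambda s: h3(s) ** 2, centred, n))  # compare 6 mu_2^3 / n(n-1)(n-2)
```

Because the enumeration is exact, agreement with the closed forms is an identity for the chosen universe and sample size rather than a simulation result.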


20. Table of Values of b for Different Functions When r < 6. This process of defining functions by means of expected values could be extended indefinitely. Perhaps it has been applied to enough functions to suggest the breadth of the applicability of the theory developed in Part I and Part III.

As the b's are the quantities which are used in the formulae I have provided Table III giving their values for the six functions, $m'_r$, $m_r$, $l_r$, $k_r$, $h_r$, $h'_r$ when r = 1, 2, 3, 4, 5. When the a's are known, the b's are computed from them according to the formulae of section 16.

TABLE III

Values of the b's for r ≤ 5

Part III. Combinatory Methods

21. Partition Representation of Expected Value of f Functions. The formulae

$$\begin{aligned}
\mu'_1(f_1) &= b_1 n\mu_1 \\
\mu'_1(f_2) &= b_2 n\mu_2 + b_{11}n(n-1)\mu_1^2 \\
\mu'_1(f_3) &= b_3 n\mu_3 + 3b_{21}n(n-1)\mu_2\mu_1 + b_{111}n(n-1)(n-2)\mu_1^3 \\
\mu'_1(f_4) &= b_4 n\mu_4 + 4b_{31}n(n-1)\mu_3\mu_1 + 3b_{22}n(n-1)\mu_2^2 \\
&\quad + 6b_{211}n(n-1)(n-2)\mu_2\mu_1^2 + b_{1111}n^{(4)}\mu_1^4
\end{aligned}$$

are "synthetically" given by the column partitions

     1

     2     1
           1

     3     2     1
           1     1
                 1

     4     3     2     2     1
           1     2     1     1
                       1     1
                             1

The partition parts represent both the subscripts of the moments and the subscripts of the b's. If $\rho$ indicates the number of parts, the n multiplier is $n^{(\rho)}$. The numerical coefficient is obtained by taking the factorial of the sum of the entries in the column (the weight) and dividing it by the factorials of all entries times the factorials of the numbers of repeated entries as indicated by

$$\binom{r}{p_1^{\pi_1}\cdots p_s^{\pi_s}} = \frac{r!}{(p_1!)^{\pi_1}(p_2!)^{\pi_2}\cdots(p_s!)^{\pi_s}\,\pi_1!\,\pi_2!\cdots\pi_s!}.$$

The translation from the synthetic partition form to the expanded form is accelerated if the coefficients are known. These are provided in the following partition representation of the formula for $\mu'_1(f_r)$ when r ≤ 8 and the results are expressed in terms of the moments of the universe.

  $\mu'_1(f_1)$:     0

  $\mu'_1(f_2)$:     1
                     2

  $\mu'_1(f_3)$:     1
                     3

  $\mu'_1(f_4)$:     1     3
                     4     2
                           2

  $\mu'_1(f_5)$:     1    10
                     5     3
                           2

  $\mu'_1(f_6)$:     1    15    10    15
                     6     4     3     2
                           2     3     2
                                       2

  $\mu'_1(f_7)$:     1    21    35   105
                     7     5     4     3
                           2     3     2
                                       2

  $\mu'_1(f_8)$:     1    28    56    35   210   280   105
                     8     6     5     4     4     3     2
                           2     3     4     2     3     2
                                             2     2     2
                                                         2

The proper formula can be stated immediately from its synthetic representation. Thus for example

$$\mu'_1(f_6) = b_6 n\mu_6 + 15b_{42}n(n-1)\mu_4\mu_2 + 10b_{33}n(n-1)\mu_3^2 + 15b_{222}n(n-1)(n-2)\mu_2^3.$$
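The numerical coefficients of the synthetic table follow directly from the factorial rule of this section. The sketch below regenerates them for every partition without unit parts; the function names (`partitions`, `ways`, `synthetic_table`) are illustrative assumptions, not notation from the paper.

```python
from collections import Counter
from math import factorial


def partitions(r, largest=None, smallest=1):
    """Yield partitions of r (non-increasing tuples) with all parts >= smallest."""
    if largest is None:
        largest = r
    if r == 0:
        yield ()
        return
    for p in range(min(r, largest), smallest - 1, -1):
        for rest in partitions(r - p, p, smallest):
            yield (p,) + rest


def ways(parts):
    """r! divided by the factorials of the parts and of their multiplicities."""
    denom = 1
    for p in parts:
        denom *= factorial(p)
    for mult in Counter(parts).values():
        denom *= factorial(mult)
    return factorial(sum(parts)) // denom


def synthetic_table(r):
    """(coefficient, partition) pairs for f_r when the universe has mu_1 = 0."""
    return [(ways(q), q) for q in partitions(r, smallest=2)]


for r in range(2, 9):
    print(r, synthetic_table(r))
```

Each pair `(c, q)` stands for the term c · b_q · n^(ρ) · μ_{q_1}μ_{q_2}⋯, with ρ the number of parts of `q`.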


22. Partition Representation of the Expected Value of a Product of f Functions. Two column partitions may be used similarly to represent the expected values of the products of two f's, three column partitions for the expected value of the triple product, etc. In order to obtain all terms it is only necessary to combine every partition of each f in every possible way. The synthetic representation of $E(f_2, f_1)$ is

     1     1     2     1
    21    20    11    10
          01    10    10
                      01

The sum of the entries in each row indicates the proper moment while the number of rows indicates the number of parts as in the preceding section. The n coefficient associated with a $\rho$ rowed partition is then $n^{(\rho)}$. The b coefficient is indicated by the columnar entries. Thus

$$\mu'_{11}(f_2, f_1) = b_2b_1 n\mu_3 + [b_2b_1 + 2b_{11}b_1]\,n(n-1)\mu_2\mu_1 + b_{11}b_1\,n(n-1)(n-2)\mu_1^3.$$

We verify this by the algebraic method:

$$\begin{aligned}
\mu'_{11}(f_2, f_1) &= E\{[a_2(2) + a_{11}(1)(1)][a_1(1)]\} \\
&= E[a_2a_1(2)(1) + a_{11}a_1(1)^3] \\
&= a_2a_1[n\mu_3 + n(n-1)\mu_2\mu_1] + a_{11}a_1[n\mu_3 + 3n(n-1)\mu_2\mu_1 + n(n-1)(n-2)\mu_1^3] \\
&= (a_2 + a_{11})a_1 n\mu_3 + (a_2 + a_{11})a_1 n(n-1)\mu_2\mu_1 + 2a_{11}a_1 n(n-1)\mu_2\mu_1 + a_{11}a_1 n(n-1)(n-2)\mu_1^3 \\
&= b_2b_1 n\mu_3 + b_2b_1 n(n-1)\mu_2\mu_1 + 2b_{11}b_1 n(n-1)\mu_2\mu_1 + b_{11}b_1 n(n-1)(n-2)\mu_1^3
\end{aligned}$$

as indicated.

It thus appears that the partition representation is a mnemonic device for indicating the solution as obtained by the algebraic method. A more formal justification is based upon the property that if

$$E(f_2) = b_2(2) + b_{11}(1)(1) \quad\text{and}\quad E(f_1) = b_1(1)$$

then $E(f_2, f_1)$ can be obtained by a symbolic multiplication of $b_2(2) + b_{11}(1)(1)$ by $b_1(1)$ where the b's are multiplied but the power sums are collected in all possible ways. Thus

$$E(f_2, f_1) = b_2b_1[(3) + (2)(1)] + b_{11}b_1[2(2)(1) + (1)^3]$$

which gives

$$E(f_2, f_1) = b_2b_1 n\mu_3 + b_2b_1 n(n-1)\mu_2\mu_1 + 2b_{11}b_1 n(n-1)\mu_2\mu_1 + b_{11}b_1 n^{(3)}\mu_1^3$$

as before.

This symbolic multiplication is generally true and serves as the real algebraic 
justification of the partition representation. It will be established in a later 
paper dealing with the more general case of a finite population. The general 
type of partition analysis has been used previously by Fisher ( 3 ) and Georgescu 
( 4 ). Each has established it through analytic rather than algebraic means. 
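The phrase "collected in all possible ways" can be made concrete by brute force. In the sketch below each f of weight w contributes w labelled units, every set partition of the units is enumerated, and each block is recorded as a row of column counts; the tally of row patterns then gives the numerical coefficients. This is an illustrative check, not the method of Fisher or Georgescu, and the function names are assumptions.

```python
from collections import Counter


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller


def two_way_partitions(weights, allow_unit_rows=True):
    """Tally set partitions of the units by their two-way partition pattern."""
    units = [(col, k) for col, w in enumerate(weights) for k in range(w)]
    tally = Counter()
    for blocks in set_partitions(units):
        rows = []
        for block in blocks:
            row = [0] * len(weights)
            for col, _ in block:
                row[col] += 1
            rows.append(tuple(row))
        if not allow_unit_rows and any(sum(row) == 1 for row in rows):
            continue  # a row of weight 1 carries mu_1, which is 0 about the mean
        tally[tuple(sorted(rows, reverse=True))] += 1
    return dict(tally)


for pattern, count in sorted(two_way_partitions((2, 1)).items(), reverse=True):
    print(count, pattern)
```

Columns are kept distinct here, so when two f's have equal weight the patterns that differ only by an interchange of columns are tallied separately and must be added to obtain a single tabulated coefficient.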

23. Determination of the Coefficients. Methods of determining the numerical coefficient have previously been given by such authors as Fisher (3), Wishart (5) (7) and Georgescu (4). If the f's are of different weight, the coefficient of any partition (an interchange of rows is not looked upon as changing the partition) is given by writing in the numerator the factorials of the different r's and in the denominator the factorials of all the different entries and the factorials of the numbers of repeated rows. Thus the coefficient of

   210
   111
   111

is

$$\frac{4!\,3!\,2!}{2!\,(1!)^7\,2!} = 72.$$

In case two or more functions have the same weight additional equivalent partitions are formed by interchange of columns. The reader is referred to the above papers for rules for determining the coefficients in the more involved cases though the coefficients are presented for all the two way partitions of the next section.

An alternative method of finding the coefficients is that given by C. C. Craig (2, 24-25) since it appears that the symbolic formulae used in the present paper are essentially his formulae for $\nu$'s in terms of $\lambda$'s. For example his formula for $\nu_{44}$ (2, 22) is given symbolically by the formula for 44 in the next section. The only difference revealed is that the subscripts of the $\lambda$'s are read by rows rather than by columns and that they are sometimes interchanged. The more precise formulization is needed for the present interpretation although it is not needed for Prof. Craig's purpose.
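The factorial rule for functions of different weight is short enough to state as a function. The following sketch assumes a partition is given as a list of row tuples; the name `coefficient` is hypothetical. It does not add the equivalent partitions obtained by interchanging columns of equal weight, which must be handled separately as the text notes.

```python
from collections import Counter
from math import factorial


def coefficient(rows):
    """Numerical coefficient of a two-way partition (columns of different weight).

    Numerator: product of the factorials of the column sums (the weights r).
    Denominator: factorials of all entries, times the factorial of the
    multiplicity of each repeated row.
    """
    numer = 1
    for col in zip(*rows):
        numer *= factorial(sum(col))
    denom = 1
    for row in rows:
        for entry in row:
            denom *= factorial(entry)
    for mult in Counter(rows).values():
        denom *= factorial(mult)
    return numer // denom


print(coefficient([(2, 1, 0), (1, 1, 1), (1, 1, 1)]))  # the worked example
print(coefficient([(3, 1), (1, 1)]))                   # a term of f_4 f_2
```

Rows must be tuples so that repeated rows can be counted; the integer division is exact for any valid partition because the coefficient is a count of arrangements.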

A third method utilizes the symbolic multiplication process stated in section 22. Subscripts of the b's are used to indicate which power sums are collected. Thus $[b_2(2) + b_{11}(1)(1)]^2$ gives

$$\underline{b_{22}(4) + b_{20}b_{02}(2)(2)} + \underline{2\{2b_{21}b_{01}(3)(1) + b_{20}b_{01}b_{01}(2)(1)(1)\}} + \underline{2b_{11}b_{11}(2)(2) + 4b_{11}b_{10}b_{01}(2)(1)(1) + b_{10}b_{10}b_{01}b_{01}(1)(1)(1)(1)}$$

where the underscored terms indicate the products given by $[b_2(2)]^2$, $2[b_2(2)][b_{11}(1)(1)]$, and $[b_{11}(1)(1)]^2$ respectively. This is represented by

     1     1     4     2     2     4     1
    22    20    21    20    11    11    10
          02    01    01    11    10    10
                      01          01    01
                                        01

The underscored terms (the first, second and fifth) are the only ones remaining when $\mu_1 = 0$.

This method is especially useful when a large number of formulae are to be 
computed, as in the next section. 

24. The Partition Representation of Formulae of Total Weight ≤ 8. The partition representations of $\mu'_1(f_r)$ when r ≤ 8 are given in section 21. The partition representations of the remaining formulae of total weight ≤ 8, which do not contain unit parts, are given below.

22
     1     1     2
    22    20    11
          02    11

32
     1     1     3     6
    32    30    12    21
          02    20    11

42
     1     1     8     6     4     6     3    12
    42    40    31    22    30    21    20    20
          02    11    20    12    21    20    11
                                        02    11

33
     1     6     9     1     9     9     6
    33    31    22    30    21    20    11
          02    11    03    12    11    11
                                  02    11

222
     1     3    12     6     4     1     6     8
   222   220   211   201   111   200   200   110
         002   011   021   111   020   011   011
                                 002   011   101

52
     1     1    10    10     5    10    20    10    20    15    60
    52    50    41    32    40    22    31    30    30    12    21
          02    11    20    12    30    21    20    11    20    20
                                              02    11    20    11

43
     1     3    12     6     1     4    12    18    12     3    18    36    36
    43    41    32    23    40    13    31    22    30    03    21    12    21
          02    11    20    03    30    12    21    11    20    20    20    11
                                                    02    20    02    11    11

322
     1     2     4    12     3     1     4     6    12    12
   322   320   311   221   122   022   301   220   121   211
         002   011   101   200   300   021   102   201   111

     1     2     6    12    12    12    24    12    24
   300   300   102   021   201   111   210   120   111
   020   011   020   101   020   011   101   101   101
   002   011   200   200   101   200   011   101   110

62
     1     1    12    15     6    30    20    15    20
    62    60    51    42    50    41    32    40    31
          02    11    20    12    21    30    22    31

    15    30   120    45    10    60   120    90    15    90
    40    40    31    22    30    30    30    21    20    20
    20    11    20    20    30    12    21    21    20    20
    02    11    11    20    02    20    11    20    20    11
                                                    02    11

53
     1     3    15    10     1    15    30    10     5    30
    53    51    42    33    50    41    32    23    40    31
          02    11    20    03    12    21    30    13    22

    15    60    90    15    30    10    30    60    90    90    45    60
    40    31    22    13    31    30    30    30    12    21    20    20
    11    11    20    20    20    03    21    12    21    21    20    11
    02    11    11    20    02    20    02    11    20    11    11    11
                                                                02    11

44
     1    12    16     8    48     1    16    18
    44    42    33    41    32    40    31    22
          02    11    03    12    04    13    22

     6    96    36    72    48    16    72   144     9    72    24
    40    31    22    22    30    30    21    21    20    20    11
    02    11    20    11    12    03    21    12    20    11    11
    02    02    02    11    02    11    02    11    02    11    11
                                                    02    02    11


422
     1     2     4    16     6     4     8     4    24    16
   422   420   411   321   222   401   320   122   212   311
         002   011   101   200   021   102   300   210   111

     1    16     6    12
   400   310   220   211
   022   112   202   211

     1     2    16    32    12     3    24    24    48    48
   400   400   310   310   202   022   211   220   211   121
   020   011   110   101   200   200   200   101   101   200
   002   011   002   011   020   200   011   101   110   101

     8    16    12    24    12    16    48    96    24    24
   300   300   210   021   120   300   201   210   111   210
   120   021   210   201   102   111   120   111   111   201
   002   101   002   200   200   011   101   101   200   011

     3    24     6    48    24
   200   200   200   200   110
   200   110   200   110   110
   020   110   011   101   101
   002   002   011   011   101


PAUL S. DWYER


332
     1     1     9    12     6     2    18    18     6    12
   332   330   222   321   312   302   212   221   320   311
         002   110   011   020   030   120   111   012   021

     2     9    18     6
   301   220   211   310
   031   112   121   022

     9    18     6    12    12    18     9    72    18    36
   220   220   310   301   310   202   112   211   112   211
   110   101   020   020   011   110   200   110   110   101
   002   011   002   011   011   020   020   011   110   020

     1     6    12     9    18    36    36    18    36    72    36
   300   300   300   210   210   210   201   201   210   210   111
   030   012   021   120   102   012   111   021   101   111   111
   002   020   011   002   020   110   020   110   021   011   110

     9    18    36     6    36
   200   200   200   110   110
   110   101   110   110   110
   020   011   011   110   101
   002   020   011   002   011

2222
      1      4     24     24     32      3     24      8
   2222   2220   2211   2201   2111   2200   2011   1111
          0002   0011   0021   0111   0022   0211   1111

      6     12     48     96     48
   2200   2200   2011   2011   1111
   0020   0011   0011   0101   1100
   0002   0011   0200   0110   0011

     24     48     96     16     48     16     32
   2001   2010   2100   0111   1011   1011   0111
   0201   0201   0111   0111   1110   0111   1101
   0020   0011   0011   2000   0101   1100   1010

      1     12     32     12     48
   2000   2000   2000   1100   1100
   0200   0200   0101   1100   0110
   0020   0011   0110   0011   0011
   0002   0011   0011   0011   1001

25. The Formulae for the Sample Moments about a Fixed Point in Terms of the Moments of the Universe. The partitions of section 21 and section 24 can be immediately interpreted to give the formulae for the moments of the sample function. For example

$$\mu'_{11}(f_3, f_2) = b_3b_2\,n\mu_5 + (b_3b_2 + 3b_{21}b_2 + 6b_{21}b_{11})\,n(n-1)\mu_3\mu_2$$

and the value of $\mu'_{11}(m_3, m_2)$ as given in section 15 can be read by inspection. The values of the b's are to be inserted for any specific function. The coefficient of $\mu_2^3$ in the expansion of $\mu'_3(f_2)$ is

$$(b_2^3 + 6b_2b_{11}^2 + 8b_{11}^3)\,n(n-1)(n-2).$$

In case $f_2 = m_2$, $b_2 = \dfrac{n-1}{n^2}$ and $b_{11} = -\dfrac{1}{n^2}$ so that the coefficient is

$$\frac{(n-1)(n-2)(n^3 - 3n^2 + 9n - 15)}{n^5}$$

as indicated previously by Tchouproff (10, 192) and Church (9, 82).
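The substitution for $f_2 = m_2$ is a one-line computation. The sketch below evaluates the coefficient of $\mu_2^3$ from the three partitions of 222 with three rows and compares it with the closed form quoted from Tchouproff and Church; the function names are illustrative assumptions.

```python
from fractions import Fraction


def falling(n, k):
    """Factorial power n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def coeff_mu2_cubed(n, b2, b11):
    """Coefficient of mu_2^3 in mu'_3(f_2), read from the three-row partitions of 222."""
    return (b2 ** 3 + 6 * b2 * b11 ** 2 + 8 * b11 ** 3) * falling(n, 3)


def coeff_for_m2(n):
    """The same coefficient with the b's of the sample variance m_2 inserted."""
    b2 = Fraction(n - 1, n ** 2)
    b11 = Fraction(-1, n ** 2)
    return coeff_mu2_cubed(n, b2, b11)


for n in (3, 5, 10):
    closed = Fraction((n - 1) * (n - 2) * (n ** 3 - 3 * n ** 2 + 9 * n - 15), n ** 5)
    print(n, coeff_for_m2(n), closed)
```

For any other function of weight 2 only the two arguments `b2` and `b11` change.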

The partitions of section 21 give the 8 formulae Mr,(jv) which Tchouproff 
gave (10, 155). In this case $f_r = m'_r$ and every b is 0 except those having single subscripts and these equal $\dfrac{1}{n}$.

The partitions of section 21 give the formulae Vr , (jv) which were given by 
Tchouproff (10, 186). In this case it is only necessary to take $f_r = m_r$ and to give the b's the proper values. Tchouproff has arranged his results according to decreasing powers of n. As an illustration we derive his result for $\mu'_1(m_4)$. From section 21

$$\mu'_1(f_4) = b_4 n\mu_4 + 3b_{22}n(n-1)\mu_2^2$$

and from Table II

$$b_4 = \frac{(n-1)(n^2 - 3n + 3)}{n^4} \quad\text{and}\quad b_{22} = \frac{2n - 3}{n^4}$$

so that

$$\mu'_1(m_4) = \mu_4 + \frac{1}{n}(6\mu_2^2 - 4\mu_4) - \frac{1}{n^2}(15\mu_2^2 - 6\mu_4) + \frac{1}{n^3}(9\mu_2^2 - 3\mu_4)$$

as indicated by him.

The partitions of section 24 also give formulae which have appeared before. For example the partitions

     1     1     2
    22    20    11
          02    11

which symbolize the formula

$$\mu'_2(f_2) = b_2^2\,n\mu_4 + (b_2^2 + 2b_{11}^2)\,n(n-1)\mu_2^2$$

become

$$\frac{n-1}{n^3}\left[(n-1)\mu_4 + (n^2 - 2n + 3)\mu_2^2\right]$$

which was early derived by "Student" (8, 3) and Tchouproff (10, 192). Similarly the partitions of 222 and 2222 give the formulae for $\mu'_3(m_2)$ and $\mu'_4(m_2)$ which were given by Tchouproff (10, 192-193) and Church (9, 82).

Sections 21 and 24 can then be used to write the moments about a fixed 
point of a sample function in terms of the moments of the universe. In the 
case of new functions the b’s must first be determined. Formulae involving 
unit columnar partitions are not included. If the formulae were desired in 
terms of moments about a fixed point of the universe, it would be necessary 
to write in addition all possible partitions. See for example the last formula 
of section 23. 

26. The Formulae for Moments of Any Sample Function in Terms of Moments of the Universe. The partitions of sections 21 and 24 are also useful in writing the formulae for the moments of the sample moments. It is necessary to make the usual adjustments in changing from moments about a fixed point to moments:

$$\mu_2(f_r) = \mu'_2(f_r) - [\mu'_1(f_r)]^2$$

$$\mu_{11}(f_{r_1}, f_{r_2}) = \mu'_{11}(f_{r_1}, f_{r_2}) - \mu'_{10}(f_{r_1}, f_{r_2})\,\mu'_{01}(f_{r_1}, f_{r_2}).$$

The particular two way partitions which are involved in this adjustment are immediately recognizable. They are the ones which have an entry which is the only entry in the row and in the column in which it is. Thus

      3
    220
    002

gives one of the terms contributing to $\mu'_2(f_2)\,\mu'_1(f_2)$. In addition its coefficient is the same, if sign is not considered, as the coefficient of $\mu'_2(f_2)\,\mu'_1(f_2)$ in the expansion of $\mu_3(f_2)$ in terms of moments about a fixed point. This has to be true since each is the number of ways of forming

    220
    002

And so in general the remaining function of n accompanying this adjustment is the product of the coefficient associated with 22 and that associated with 2. The sign is plus when odd numbers of moments are multiplied and minus when even numbers of moments are multiplied. Hence the partition above, with its coefficient 3, contributes $-3n^2b_2^3$ to the adjustment to moments and its total contribution to the value of $\mu_3(f_2)$ is $3b_2^3[n(n-1) - n^2] = -3b_2^3 n$. More extensive study leads to the following general method of using the formulae of section 24.

A. Write the coefficient of every two way partition according to section 25.

B. Block off each single entry by drawing a line through its row and column. For example

       6
    2200
    0020
    0002

The resulting partitions, 22, 2, 2 are called component parts.

C. Form new partitions by eliminating component parts one at a time, two at a time, three at a time, etc. from the original partition in all possible ways.

D. Form the coefficient of the resulting parts according to the methods of section 25. Multiply by $(-1)^{s-1}$ where s is the number of resulting parts. The values of b will not change.

E. Multiply in addition by s - 1 when the component parts are all taken separately.

As an example we find the contribution of the partition

       6
    2200
    0020
    0002

to the value of $\mu_4(f_2)$. It gives

$$6b_2^4[n(n-1)(n-2) - 3n^2(n-1) + 2n^3]\,\mu_4\mu_2^2 = 12n\,b_2^4\,\mu_4\mu_2^2.$$

Similarly

       1
    2000
    0200
    0020
    0002

contributes

$$b_2^4[n^{(4)} - 4n\,n^{(3)} + 6n^3(n-1) - 3n^4]\,\mu_2^4 = 3b_2^4(n-2)n\,\mu_2^4.$$

We use the method in finding the coefficient of $\mu_2^3$ in the expansion of $\mu_3(m_2)$. We find first the coefficient of $\mu_2^3$ in the expansion of $\mu_3(f_2)$. It is indicated by the partitions

     1     6     8
   200   200   110
   020   011   011
   002   011   101

so that the coefficient of $\mu_2^3$ is

$$\begin{aligned}
&b_2^3[n(n-1)(n-2) - 3n^2(n-1) + 2n^3] + 6b_2b_{11}^2[n(n-1)(n-2) - n^2(n-1)] + 8b_{11}^3\,n(n-1)(n-2) \\
&\quad = b_2^3(2n) + 6b_2b_{11}^2(-2n^2 + 2n) + 8b_{11}^3\,n(n-1)(n-2).
\end{aligned}$$

When $b_2 = \dfrac{n-1}{n^2}$ and $b_{11} = -\dfrac{1}{n^2}$ this becomes $\dfrac{2(n-1)(n^2 - 12n + 15)}{n^5}$ as previously given by such authors as Tchouproff (10, 194), Church (9, 82), Carver (Richardson) (11, 271).

The general Tchouproff-Church formulae for the third and fourth moments of the variance may be written out in this way as may many other moment formulae which have not been printed.

27. The Thiele Moments of the Sample Function in Terms of the Moments of the Universe. It is possible also to write the Thiele moments of the sample function in terms of the moments of the universe. The technique is very similar to that of the previous section. The basis of the transformation is now the formula for Thiele moments in terms of moments about a fixed point rather than moments in terms of moments about a fixed point. The results are the same as those of the last section when a double or a triple product of f's is involved, but they differ with the introduction of a larger number of products. The partitions having component parts are broken up into these component parts as before but the parts are combined in all possible ways. Multipliers are determined as before with the exception that there is a multiplication by $(-1)^{s-1}(s-1)!$ where s is the number of resultant parts. Thus the term

    2000
    0200
    0020
    0002

contributes

$$b_2^4[n^{(4)} - 4n\,n^{(3)} - 3n^2(n-1)^2 + 12n^3(n-1) - 6n^4]\,\mu_2^4 = -6n\,b_2^4\,\mu_2^4$$

to the value of $\lambda_4(f_2)$.
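The bracketed functions of n in sections 26 and 27 are polynomial identities in the factorial powers $n^{(k)}$, and can be confirmed numerically. The sketch below is an illustrative check; the function names are assumptions made here.

```python
def falling(n, k):
    """Factorial power n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def moment_bracket_2200(n):
    """Section 26: bracket attached to the partition 2200 / 0020 / 0002."""
    return falling(n, 3) - 3 * n ** 2 * (n - 1) + 2 * n ** 3


def moment_bracket_2000(n):
    """Section 26: bracket attached to 2000 / 0200 / 0020 / 0002 for mu_4(f_2)."""
    return falling(n, 4) - 4 * n * falling(n, 3) + 6 * n ** 3 * (n - 1) - 3 * n ** 4


def thiele_bracket_2000(n):
    """Section 27: the same partition with multipliers (-1)^(s-1) (s-1)!."""
    return (falling(n, 4) - 4 * n * falling(n, 3) - 3 * n ** 2 * (n - 1) ** 2
            + 12 * n ** 3 * (n - 1) - 6 * n ** 4)


for n in (4, 7, 20):
    print(n, moment_bracket_2200(n), moment_bracket_2000(n), thiele_bracket_2000(n))
```

The three columns printed should reduce to 2n, 3n(n - 2) and -6n respectively, the values used in the text.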

28. The Moments About a Fixed Point of the Sample Function in Terms 
of the Thiele Moments of the Universe. We return to the problem of section 
25, only we wish to express the results in terms of the Thiele moments of 
the universe. We must use the formulae of section 12. 



where 1. 

Thus $\mu_r$ will contribute to all partitions of r and inversely the contributions to a given partition are composed only of those terms which are obtained by combining the different elements of the partition. Since the numerical coefficient in the expansion of $\mu_r$ is the number of ways in which the r units can be collected to form the partition, it follows at once that the complete $\lambda$ coefficient can be obtained by grouping the parts of the partition in all possible ways, determining the coefficient of each according to the methods of section 25, and adding. In this way the formulae of section 21 can be used to give expansions in terms of partition moments. For example the representation of $\mu'_1(f_6)$ gives at once

$$\begin{aligned}
&b_6 n\lambda_6 + 15[b_6 n + b_{42}n(n-1)]\lambda_4\lambda_2 + 10[b_6 n + b_{33}n(n-1)]\lambda_3^2 \\
&\quad + 15[b_6 n + 3b_{42}n(n-1) + b_{222}n(n-1)(n-2)]\lambda_2^3.
\end{aligned}$$

The partitions of section 21 can be made to give the formulae which were given by Thiele (1, 45-46). For example the formula for $\mu'_1(l_4)$ is indicated by

     1     3
     4     2
           2

so that

$$\mu'_1(f_4) = b_4 n\lambda_4 + 3[b_4 n + b_{22}n(n-1)]\lambda_2^2$$

and since

$$b_4 = \frac{(n-1)(n^2 - 6n + 6)}{n^4} \quad\text{and}\quad b_{22} = -\frac{n^2 - 4n + 6}{n^4},$$

$$\mu'_1(l_4) = \frac{(n-1)(n^2 - 6n + 6)\lambda_4}{n^3} - \frac{6(n-1)\lambda_2^2}{n^2}$$

which agrees with the result as given by him (1, 45).

The two way partitions of section 24 can be used similarly. This device for changing to the $\lambda$'s is due to the ingenuity of R. A. Fisher who applied it to the case where $f_r = k_r$.

As an illustration we write from section 24 the value of $\mu'_2(f_2)$ in terms of $\lambda$'s. The partition representation

     1     1     2
    22    20    11
          02    11

gives at once

$$b_2^2\,n\lambda_4 + [b_2^2 n + b_2^2 n(n-1)]\lambda_2^2 + 2[b_2^2 n + b_{11}^2 n(n-1)]\lambda_2^2$$

which agrees with the result of section 12. The other illustrations of that section may be written out similarly.

As a final illustration of this technique we find the coefficient of $\lambda_4^2$ in the expansion of $\mu'_{21}(f_3, f_2)$. The partitions are

     2     9    18     6
   301   220   211   310
   031   112   121   022

and the coefficient is

$$\begin{aligned}
&2[b_3^2b_2 n + b_3^2b_{11}n(n-1)] + 9[b_3^2b_2 n + b_{21}^2b_2 n(n-1)] \\
&\quad + 18[b_3^2b_2 n + b_{21}^2b_{11}n(n-1)] + 6[b_3^2b_2 n + b_3b_{21}b_2 n(n-1)].
\end{aligned}$$

If the b's are inserted to form the k's, the first and last terms become 0 and the others give $\dfrac{27n - 45}{n(n-1)^2}$. This agrees with the value as given by R. A. Fisher (3, 208).
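The reduction to Fisher's value can be followed term by term. In this sketch the b's of the k functions are taken from section 17 ($b_3 = b_2 = 1/n$, $b_{21} = b_{11} = -1/n(n-1)$) and the four bracketed terms are evaluated exactly; the function name `lambda4_sq_terms` is a hypothetical label.

```python
from fractions import Fraction


def lambda4_sq_terms(n):
    """The four bracketed terms of the coefficient of lambda_4^2 for f = k."""
    b3 = b2 = Fraction(1, n)
    b21 = b11 = Fraction(-1, n * (n - 1))
    n2 = n * (n - 1)          # the factorial power n^(2)
    merged = b3 * b3 * b2 * n  # every grouping into a single row gives b_3^2 b_2 n
    return [
        2 * (merged + b3 * b3 * b11 * n2),    # partition 301 / 031
        9 * (merged + b21 * b21 * b2 * n2),   # partition 220 / 112
        18 * (merged + b21 * b21 * b11 * n2), # partition 211 / 121
        6 * (merged + b3 * b21 * b2 * n2),    # partition 310 / 022
    ]


n = 10
terms = lambda4_sq_terms(n)
print(terms, sum(terms))
```

The first and last entries of the list vanish for every n, which is the cancellation noted in the text.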


29. The Moments of the Sample Function in Terms of the Thiele Moments of the Universe. The partition representations of section 21 and section 24 can be used similarly to write formulae for the moments of the sample function in terms of the Thiele moments of the universe. It is only necessary to use the general plan of section 26, but to write the coefficient of every resulting partition according to the method of section 28. For example the partition

    2000
    0200
    0020
    0002

gives the coefficient

$$\begin{aligned}
&b_2^4[n + 4n^{(2)} + 3n^{(2)} + 6n^{(3)} + n^{(4)}] - 4b_2^4[n^2 + 3n^2(n-1) + n^2(n-1)(n-2)] \\
&\quad + 6b_2^4[n^3 + n^3(n-1)] - 3b_2^4n^4 = b_2^4[n^4 - 4n^4 + 6n^4 - 3n^4] = 0.
\end{aligned}$$


30. The Thiele Moments of the Sample Function in Terms of the Thiele Moments of the Universe. The partition representations of section 21 and section 24 can also be interpreted to give the Thiele moments of the sample function in terms of the Thiele moments of the universe. The scheme is similar to that of section 29 except that the formulae for changing to Thiele moments are used as in section 27. For example the partition

    2000
    0200
    0020
    0002

has now associated with it

$$\begin{aligned}
&b_2^4[n + 4n^{(2)} + 3n^{(2)} + 6n^{(3)} + n^{(4)}] - 4b_2^4[n^2 + 3n^2(n-1) + n^2(n-1)(n-2)] \\
&\quad - 3b_2^4[n + n(n-1)]^2 + 12b_2^4[n^3 + n^3(n-1)] - 6b_2^4n^4 = 0.
\end{aligned}$$

The application of this method enables one to write the formulae of section 13 (and others which they typify) with relative ease. It is now possible to complete the task left unfinished in section 16. We do not take the space necessary to write all the terms of $\lambda_{21}(f_3, f_2)$ since the lengthy expression can be obtained quite readily from the representation of section 24. One term, say the coefficient of $\lambda_6\lambda_2$, is represented by

     1     9    12     6
   330   222   321   312
   002   110   011   020

and gives

$$9[b_3^2b_2 n + b_{21}^2b_2 n(n-1)] + 12[b_3^2b_2 n + b_3b_{21}b_{11}n(n-1)] + 6[b_3^2b_2 n + b_3b_{21}b_2 n(n-1)]$$

which becomes $\dfrac{21}{n(n-1)}$ when $b_3 = b_2 = \dfrac{1}{n}$ and $b_{21} = b_{11} = -\dfrac{1}{n(n-1)}$. This agrees with the result given by R. A. Fisher (3, 209).

For simplicity of form it is logical to use this formulization of results, Thiele moments in terms of Thiele moments, and it has been used by Thiele (1), Craig (2), Fisher (3) and Georgescu (4). They however have used different sample moment functions. Thiele and Georgescu used the Thiele moments of the sample, Craig and Georgescu the moments while Fisher introduced the k function.

The present discussion deals with the corresponding partition moments of 
any rational integral isobaric moment function of the sample. The results 
indicated here give many of the results of the previous authors as special cases. 
For example the symbolic formula 44 of section 24 gives the of Thiele 

( 1 , 46), the Sm{pt, Vi) of Craig ( 2 , 57), the *(44) of R. A. Fisher (3, 210) as 
special cases when the formula 44 is given the interpretation of this section. 

Some may prefer the Craig attack (2, 21-36) to the partition method. It should be noted that the formulae of sections 21 and 24 can be used in place of part of the Craig method. Thus his formulae (2, 22)

$$\nu_{80} = \lambda_{80} + 28\lambda_{60}\lambda_{20} + 56\lambda_{50}\lambda_{30} + \text{etc.}$$

$$\nu_{44} = \lambda_{44} + (12\lambda_{42}\lambda_{02} + 16\lambda_{33}\lambda_{11}) + \text{etc.}$$

are immediately obtainable from the symbolic formulae by writing $\lambda$'s in place of b's and by using row, rather than column, subscripts. It is then necessary to compute the values of $\lambda_{tu}$, ... as given by him (2, 16-17, 40) and to insert in his expansions of $S_{tu}(\nu_m, \nu_n)$ in terms of $\nu$'s. For example

$$S_{11}(\nu_3, \nu_2) = \frac{1}{n}\left[\nu_{50} + (n-1)\nu_{32} - n\nu_{30}\nu_{02}\right] \quad (2, 32)$$

and from the symbolic formulae of sections 21 and 24

$$\begin{aligned}
\nu_{50} &= \lambda_{50} + 10\lambda_{30}\lambda_{20} \\
\nu_{32} &= \lambda_{32} + \lambda_{30}\lambda_{02} + 3\lambda_{12}\lambda_{20} + 6\lambda_{21}\lambda_{11} \\
\nu_{30} &= \lambda_{30} \\
\nu_{20} &= \lambda_{20}
\end{aligned}$$

so that

$$S_{11}(\nu_3, \nu_2) = \frac{1}{n}\left[\lambda_{50} + (n-1)\lambda_{32} + 9\lambda_{30}\lambda_{20} + (n-1)(6\lambda_{21}\lambda_{11} + 3\lambda_{12}\lambda_{20})\right] \quad (2, 30)$$

which agrees with that given by Prof. Craig (aside from an obvious typographical error). The insertion of the values of $\lambda$ gives the value as indicated by $\lambda_{11}(m_3, m_2)$ of section 13 and by the first method of the present section.

31. Special Rules for the Determination of the Coefficients in the Case of 
ffie Fisher and Georgescu Analyses. R. A. Fisher (3) gave a number of simple 
rules which assist greatly in the determination of the coeflScients accompan}dng 
the partitions. Georgescu (4) also introduced special rules for the evaluation 
of the coefficients of the different partitions he used. It is not to be expected 
that all these rules are applicable in the more general case under present con- 
sideration, but the vanishing of such coefficients as that of 2000 leads one to 

0200 

0020 

0002 

suspect that there mig^ be some rules which are applicable to this general 
case. A sensible method of procedure is to examine the rules of Fisher and 
Georgescu and determine if they hold in the more general analysis. The special 
rules of R. A. Fisher might be given somewhat as follows. 

A. If a partition has a column with a single entry, that column may be 
eliminated and the factor rT^ introduced. 

B. Any partition having a row with a single entry may be neglected. 

C. “We may exclude any partition in which any set of rows is connected
to its complementary set by a single column only.”

D. In determining the algebraic coefficient of a partition the “pattern” is
sufficient and precise entries are not needed. Thus the partitions

    21        35
    11  and   42

although they have different numerical factors, have associated with them the
same function of n. This value is indicated by the pattern

    xx
    xx

which has associated with it the function $1/(n - 1)$. As a result of this
property Fisher was able to provide a table (3, 223-226) of useful patterns
which is of great assistance in writing the value of the coefficients.

E. Formulae of moments of k functions involving $k_1$ can be derived from
corresponding formulae not involving $k_1$. “The effect upon the corresponding
formula of adding a new unit part to the partition is (1) to modify every
term in the formula by increasing the suffix of one of its k functions by unity
in every possible way, and (2) to divide the whole by n” (3, 206).

Two of the important Georgescu rules may be stated. 

A'. The numerator function (aside from numerical coefficient) is not altered
if columns are changed to rows and vice versa. Thus the coefficient of 4 in
n»^ coefficient of st in 5(2*) is Georgescu 

has replaced n by N + 1.

B'. All partitions which can be broken up into component parts have
coefficients of 0. This is extended to include all partitions which have as
component parts other partitions. Thus

    2100
    1100
    0012
    0034

has a coefficient 0 as does the equivalent

    2010
    1010
    0102
    0304

32. Special Rules for the Determination of the Coefficients in the More 
General Case. In the more general case we have 

A. If a partition has a single column with a single entry, c, that column
may be eliminated and the value $b_c$ be inserted as a multiplier. This is
immediately evident since the contribution of that column to each term in the
expansion is $b_c$ times its value if the column were eliminated.

B. The coefficient of any partition having an entry which is the only entry 
in its row and column, is 0. 

This rule, which saves considerable labor in that it makes unnecessary the
computation of the coefficients of many of the partitions of section 24, is
established in this way. Without loss of generality the partition may be
represented by

$$\pi_{u+1,v+1} = \begin{matrix}
c_{11} & c_{12} & c_{13} & \cdots & c_{1v} & 0 \\
c_{21} & c_{22} & c_{23} & \cdots & c_{2v} & 0 \\
c_{31} & c_{32} & c_{33} & \cdots & c_{3v} & 0 \\
\vdots & \vdots & \vdots &        & \vdots & \vdots \\
c_{u1} & c_{u2} & c_{u3} & \cdots & c_{uv} & 0 \\
0 & 0 & 0 & \cdots & 0 & c_{u+1,v+1}
\end{matrix}$$

and $\pi_{u,v}$ may represent the partition containing the first u rows and the
first v columns. We determine the coefficient of $\pi_{u+1,v+1}$ in terms of the
coefficient of $\pi_{u,v}$. Consider first any grouping of the u rows of
$\pi_{u,v}$ into w rows. There will be w corresponding groupings of
$\pi_{u+1,v+1}$ in which the last row is added, in turn, to each of the w rows
and another w + 1 rowed term in which it is not added. In each of the first w
cases the coefficient by rule A is multiplied by $b_{c_{u+1,v+1}}$. In the case
of the w + 1 rowed partition the coefficient is multiplied by
$b_{c_{u+1,v+1}}$, and $n^{(w)}$ is replaced by $n^{(w+1)}$. A final adjustment
takes care of the transition from the moment about a fixed point of the sample
function to the Thiele moment of the sample function. This adjustment demands
the multiplication of the coefficient of $\pi_{u,v}$ by $b_{c_{u+1,v+1}}\,n$ and
the subtraction from the sum of the other terms. If $B_w$ is the coefficient of
the w rowed form, it follows at once that the corresponding coefficient is

$$B_w\, b_{c_{u+1,v+1}}\left[w\,n^{(w)} + n^{(w+1)} - n\,n^{(w)}\right] = 0.$$

This holds for the expansion of any term of $\pi_{u,v}$ and hence the coefficient
of $\pi_{u+1,v+1}$ is 0. Of course the argument holds if the partition has more
than 2 component parts.

It thus appears that this rule holds not only for $k_r$ and $m_r$ as Fisher and
Georgescu have noted, but for $f_r$.

C. The coefficient of any partition which can be broken into component
parts is 0. In this sense a component part is any group of rows or columns
which have no entry in common with any other group of rows or columns.
It corresponds in matrix language to a matrix which results when one matrix
is zero bordered by another matrix although rows and columns may thereafter
be interchanged.

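The proof of this more general case follows the general line of the simpler
case although the reasoning is more complicated. For example the coefficient of

$$\begin{matrix}
c_{11} & c_{12} & \cdots & c_{1v} & 0 & 0 \\
c_{21} & c_{22} & \cdots & c_{2v} & 0 & 0 \\
c_{31} & c_{32} & \cdots & c_{3v} & 0 & 0 \\
\vdots & \vdots &        & \vdots & \vdots & \vdots \\
c_{u1} & c_{u2} & \cdots & c_{uv} & 0 & 0 \\
0 & 0 & \cdots & 0 & c_{u+1,v+1} & c_{u+1,v+2} \\
0 & 0 & \cdots & 0 & c_{u+2,v+1} & c_{u+2,v+2}
\end{matrix}$$

is 0 since any w rowed term of the $\pi_{u,v}$ contributes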
Bwfccu+l . •+! ^C|*+l . v+a+Cu+S » r+2 ^ —.nn ] 

+ [«>(w - 1) n'"’ + 

- n(n - 1) = 0. 

Other special rules of Fisher and Georgescu do not hold in the general case.
Thus Fisher rule B is not generally true since the partitions

    12        22
    30  and   20

have respective algebraic coefficients of $b_4 b_2 n + b_{31} b_2 n(n-1)$ and

$$b_4 b_2 n + b_{22} b_2 n(n-1)$$

and these are not in general equal to 0.

The Fisher rule C is replaced by the somewhat less general C of the present
section.

The Fisher rule D is not applicable in the general case. It is applicable in
all cases in which the value of $b_{p_1 p_2 \cdots p_\rho}$ is completely
determined by the number of parts, for in this case the particular value of each
part is not pertinent. We may say then that the Fisher rule D is applicable
to all cases in which $b_{p_1 p_2 \cdots p_\rho}$ is a function of $\rho$, n
where $\rho$ is the number of parts. This condition is satisfied by
$b_{p_1 p_2 \cdots p_\rho} = (-1)^{\rho-1}(\rho-1)!/n^{(\rho)}$ and the
coefficients are worked out for it in Fisher's paper. The same method is
applicable to other functions satisfying the general condition although the
values of the coefficients will of course vary with the definition of b.

The Fisher rule E is not applicable to the general case. Its validity, from 
an algebraic standpoint, depends upon the Fisher property B which is not 
generally applicable. The Fisher rule E as applied to the more general case 
gives correct terms but it does not give all the terms. For example the Fisher 
rule E applied to Xs(fc*) gives 

n n — 1 

The application of a corresponding rule to 

— bin\i -|- 2[6jw -f- biinin — 1)]X| 

would give 

Xji(/2,/i) — hjhiwXs 4[6*6in -I- b^binin — l)]XsX3 
while the correct result is indicated by 

14 2 4 

221 210 201 111 
011 020 110 


and is 


Xji(/*/i) = hJbinXi -j- 4[6|6in *1- 6si)ii6in(w — l)]XjXj ■+■ 2[6|6iw + bibin(n l)]XiXs 
+ 4[6t6in + b\ibin(n — l)]XjX*. 
The difference is due to the vanishing of the two middle terms in the case of
the k functions.

The rule B', which Georgescu found most useful in computing and checking
his formulae, is not generally true. It is not even true in the case of the k
function, as can be discovered by using it on the list given by R. A. Fisher
(3, 210). It is interesting to note that the Georgescu method, while not being
able to utilise many of the special rules of the Fisher method, does use this rule
which is not in general adaptable to the Fisher method.

33. Special Rules in the Case of the $h_r$ Functions. Special rules can be
worked out for other sample functions. As an illustration we examine the
function $h_r$ which was defined in section 19. It is recalled that
$b_{1^p} = 1/n^{(p)}$ and that $b_{p_1 p_2 \cdots} = 0$ for all other cases.
It follows at once that

A. Any partition having any entry other than unity (or zero) may be
neglected.

B. The value of $b_{1^p}$ is $1/n^{(p)}$.

As an illustration we write the value $\lambda_{11}(h_3, h_2)$. From the
partitions of section 24 we select


36 


36 

111 


110 

111 

and 

110 

110 


101 



oil 


as being the only partitions making a contribution. The result of section 19 
follows at once. 

34. The Case of a Normal Universe. A normal universe is characterized by
the relationship that $\lambda_r = 0$ when $r > 2$. It follows that it is only
necessary to compute the coefficients of those partitions giving powers of
$\lambda_2$.

Wishart (5) (7) has developed the partition analysis of the k function in 
the case of a normal parent while Georgescu has studied the corresponding 
m function. It is not the purpose of this section to make extensive study of 
the case of the normal parent but simply to indicate that the results of section 24
are immediately applicable. As an illustration we write the values of
$\lambda_1(f_2)$, $\lambda_2(f_2)$, $\lambda_3(f_2)$, and $\lambda_4(f_2)$ in the
case of a normal universe. The terms are given successively, by

    1       2       8        48
    2       11      110      1100
            11      011      0110
                    101      0011
                             1001

and hence

$$\lambda_1(f_2) = b_2\,n\,\lambda_2$$

$$\lambda_2(f_2) = 2\left[b_2^2\,n + b_{11}^2\,n(n-1)\right]\lambda_2^2$$

$$\lambda_3(f_2) = 8\left[b_2^3\,n + 3b_2 b_{11}^2\,n(n-1) + b_{11}^3\,n(n-1)(n-2)\right]\lambda_2^3$$

$$\lambda_4(f_2) = 48\left[b_2^4\,n + 6b_2^2 b_{11}^2\,n(n-1) + b_{11}^4\,n(n-1) + 4b_2 b_{11}^3\,n(n-1)(n-2) + 2b_{11}^4\,n(n-1)(n-2) + b_{11}^4\,n(n-1)(n-2)(n-3)\right]\lambda_2^4.$$

It is only necessary to substitute the b's to obtain the results for different values
of f. This is done in Table IV.
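The substitution of the b's can be checked mechanically. The following is a minimal sketch, not part of the original paper: it assumes that $b_2$ and $b_{11}$ are the coefficients of $\sum x_i^2$ and $\sum_{i \ne j} x_i x_j$ in $f_2$, so that $k_2$ has $b_2 = 1/n$, $b_{11} = -1/[n(n-1)]$ and $m_2$ has $b_2 = (n-1)/n^2$, $b_{11} = -1/n^2$. The function names are hypothetical. Exact rational arithmetic is used so that the comparison with the closed forms for $k_2$ and $m_2$ is an identity rather than a floating-point approximation.

```python
from fractions import Fraction


def falling(n, w):
    """Return n^(w) = n(n-1)...(n-w+1)."""
    out = Fraction(1)
    for i in range(w):
        out *= (n - i)
    return out


def thiele_coeffs_f2(b2, b11, n):
    """Coefficients of lambda_2^r in lambda_r(f_2), r = 1..4, normal universe.

    b2 and b11 are assumed to be the coefficients of sum(x_i^2) and
    sum_{i != j}(x_i x_j) in the sample function f_2 (an assumption of this sketch).
    """
    n1, n2, n3, n4 = (falling(n, w) for w in (1, 2, 3, 4))
    c1 = b2 * n1
    c2 = 2 * (b2**2 * n1 + b11**2 * n2)
    c3 = 8 * (b2**3 * n1 + 3 * b2 * b11**2 * n2 + b11**3 * n3)
    c4 = 48 * (b2**4 * n1 + 6 * b2**2 * b11**2 * n2 + b11**4 * n2
               + 4 * b2 * b11**3 * n3 + 2 * b11**4 * n3 + b11**4 * n4)
    return [c1, c2, c3, c4]


# Compare with the closed forms for k_2 and m_2 at a sample size of 10.
n = Fraction(10)
k2 = thiele_coeffs_f2(1 / n, -1 / (n * (n - 1)), n)
m2 = thiele_coeffs_f2((n - 1) / n**2, -1 / n**2, n)
print("k2:", k2)
print("m2:", m2)
```

For $k_2$ the four coefficients reduce to $1$, $2/(n-1)$, $8/(n-1)^2$, $48/(n-1)^3$, and for $m_2$ to $(n-1)/n$, $2(n-1)/n^2$, $8(n-1)/n^3$, $48(n-1)/n^4$.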

TABLE IV 


The first four Thiele moments of $f_2$ for various sample functions in the case of a
normal universe


Sample 

func- 

tion 


Mfi) 


Mfi) 

m2 

n 

2(n — 1) ^2 

8 (n - 1 ) .,3 
n» 

48(n - 1) Xj 

n* 

kt 

Xa 

2 x 2 

n — 1 

8 X 2 

(n - D* 

48X2 

(n - 1)» 

k 

n 

2(n - 1) 2 

A 2 

n‘ 

8(n - 1)X5 
n® 

48U-l)Xj 

n* 

m2 

X 2 

2 X 2 

n 

8X* 

n* 

48 X 2 

h 

X 2 

2 x! 

n ^ 1 

8Xi 

(n - 1)* 

48Xj 

(n - 1)» 

hi 

A 

2 X 2 

8(n - 2)Xj 

48(n* - 3n + 3)Xj 

/t2 

u 

k 

n(n — 1) 

n*(n — 1)2 

n®(n — 1)® 


One surmises that the general value of $\lambda_r(f_2)$ is
$2^{r-1}(r-1)!\,\lambda_2^r\,B$, with the partition

    11000 ... 0
    01100 ... 0
    00110 ... 0
    . . . . . .
    00000 ... 11
    10000 ... 01

where B represents the b coefficient of the r rowed partition. This induction
is consistent with the fact that

$$\lambda_{r+1}(k_2) = \frac{2^r\, r!\, \lambda_2^{r+1}}{(n-1)^r}$$

as shown by John Wishart (7). The whole subject of the Thiele moments of
the general function in the case of a normal universe would make an interesting
subject of investigation.


35. Summary and Conclusion. The contributions of this paper include 

1. The definitions of specific moment functions in terms of power sums.

2. The use of indeterminate multipliers in representing a general isobaric 
moment function. 

3. The finding of the expected value of products of these functions by alge- 
braic methods. 

4. The use of tables in writing these expected values in terms of moments 
(or of moments about a fixed point) of the universe. 

5. The finding of the expected values of specific moment functions by sub- 
stitution. 

6. Means of establishing the expansion of new moment functions which are 
defined by their expected values. 

7. The introduction of the sample function of weight r whose expected
value is Air. 

8. The introduction of the sample function of weight r whose expected 
value is n'l. 

9. The two way partition formulae of weight $\le 8$ which do not involve
unit parts.

The use of these partition formulae in writing:

10. The moments about a fixed point of $f_r$ in terms of moments.

11. The moments of $f_r$ in terms of moments.

12. The Thiele moments of $f_r$ in terms of moments.

13. The moments about a fixed point of $f_r$ in terms of Thiele moments.

14. The moments of $f_r$ in terms of Thiele moments.

15. The Thiele moments of $f_r$ in terms of Thiele moments.

16. Special rules in the case of Thiele moments. 

17. The applicability of these results to a given sample moment function 
and hence the derivation of varied results, of such authors as Thiele, Tchouproff, 
Church, Fisher, Craig, and Georgescu, from the same partition formulae. 

18. The simplicity of the formulae when hr is used as the sample function. 

19. The application of the synthetic formulae to the Craig method. 

20. The applicability of the theory to a normal universe. 

The introduction of such general procedure opens up a wide field for future 
study. It is impossible in a single paper dealing with so broad a subject to do 
more than to outline the general scheme by which two way partitions can be 
used as a central formulization of the various formulae for moments of moments.
More detailed proofs and more extensive analysis of the more important of the 
special cases will undoubtedly be supplied by later writers. 

In later papers the author will show how the partition representation can 
be used in the case of multivariate distributions and how it can also be used, 
in connection with the sampling polynomials introduced by H. C. Carver (11), 
to represent the more complex formulae obtained in the case of finite sampling. 

It is obvious that the author is indebted to the classical moment studies of 
Fisher and Craig. He also wishes to acknowledge his indebtedness to Prof. 
Craig and to Prof. Carver who have read the manuscript and have made 
valuable suggestions. 

The University of Michigan. 


REFERENCES 

(1) Thiele, T. N. “Theory of observations.” London 1903. C. and E. Layton.
Reprinted in Annals of Mathematical Statistics, 2 (1931), pp. 165-308.

(2) Craig, C. C. “An application of Thiele's semi-invariants to the sampling
problem.” Metron, 7 (1928-29), pp. 3-74.

(3) Fisher, R. A. “Moments and product moments of sampling distributions.”
Proceedings of the London Mathematical Society, Series 2, vol. 30 (1930),
pp. 199-238.

(4) St. Georgescu, N. “Further contributions to the sampling problem.”
Biometrika, 24 (1932), pp. 65-107.

(5) Wishart, John. “The derivation of certain high order sampling product
moments from a normal population.” Biometrika, 22 (1930), pp. 224-238.

(6) Soper, H. E. “Sampling moments of moments of samples of n units each drawn
from an unchanging sampled population, from the point of view of
semi-invariants.” Journal of Royal Statistical Society, 93 (1930), pp. 104-114.

(7) Wishart, John. “A comparison of the semi-invariants of the distributions of
moment and semi-invariant estimates in samples from an infinite population.”
Biometrika, 25 (1932), pp. 52-60.

(8) “Student.” “The probable error of the mean.” Biometrika, 6 (1908), pp. 1-25.

(9) Church, A. E. R. “On the moments of the distribution of squared standard
deviations for samples of n drawn from an indefinitely large population.”
Biometrika, 17 (1925), pp. 79-83.

(10) Tchouproff, A. A. “On the mathematical expectation of the moments of
frequency distributions.” Biometrika, 12 (1918-19), pp. 140-169, 185-210.

(11) Carver, H. C. “Fundamentals of sampling.” Editorial. Annals of Mathematical
Statistics, 1 (1930), pp. 101-121; 260-274.

(12) Frisch, Ragnar. “Sur les semi-invariants et moments employés dans l'étude
des distributions statistiques.” Skrifter utgitt av det Norske
Videnskaps-Akademi i Oslo. II Hist.-Filos. Klasse (1926), no. 3.



NOTES 


A COEFFICIENT OF CORRELATION BETWEEN SCHOLARSHIP

AND SALARIES

INTRODUCTION 

Some might doubt that it is correct to apply a coefficient of correlation to 
show the relationship between scholarship and salaries. This coefficient can 
be trusted to give at least a rough approximation, which is all that is necessary 
in the inexact science of vocation. It is fictitious accuracy to be too finical 
in the application of formulas. Therefore, a coefficient of correlation between 
scholarship and salaries is a valuable part of human knowledge. 

Would it be worth while to find this coefficient if it is based upon the
experience of the American Telephone and Telegraph Company? Since the employment
practices of this company are not representative of the employment
practices of business at large, one might doubt the validity of drawing general
conclusions from such specialized data. The coefficient for business at large
is probably less than the coefficient for the Bell System; the value of this
knowledge is enhanced if we know the latter coefficient. Since this company is very
large, a coefficient between scholarship and salaries would be valuable, even if
this coefficient applies only to the Bell System and to other companies having
approximately the same employment practices.

An article¹ by Mr. Walter S. Gifford, President of the Bell System, contains
a discussion of some of the relationships between scholarship and salaries. 
President Gifford, however, did not determine in the case of the Bell System a 
coefficient of correlation between scholarship and salaries. 

The purpose of this article is not a new contribution to statistical method, 
but is an application of the method² of finding the coefficient of correlation
when the two variables have not been quantitatively measured. This method 
will be applied to the chart on page 672 of President Gifford's article, in order 
to determine for the Bell System the coefficient of correlation between scholar- 
ship and salaries. 


FINDING THE COEFFICIENT OF CORRELATION 

An explanation of the chart. It is based on the experience of 2,144 Bell
System employees over five years out of college. First, assume these employees
are grouped according to their grades in college. In the high scholarship group
put those who graduated in the highest third of their classes. The middle
and low scholarship groups are formed in like manner. Secondly, suppose the
same employees are divided into three equal groups according to their salaries.
Then, the salary of any one of the employees would be high, middle, or low.

¹ It is entitled “Does Business Want Scholars?” and was printed in the May 1928
issue of Harper's Magazine.

² It can be found in Elderton's “Frequency Curves and Correlation.”

Assume a hypothetical group of 300 employees who are college graduates. 
Suppose that the scholarship of 100 of them was high, that the scholarship of 
100 of them was middle, and that the scholarship of the others was low. Also 
assume that the salary experience of these 300 employees is the same as that 
of the 2,144 employees of the Bell System. 

The 300 employees can be grouped according to the following table. 


TABLE NO. 1

                        Scholarship
Salary        Low     Middle     High     Totals
High           22       24        48        94
Middle         31       39        27        97
Low            47       37        25       109
Totals        100      100       100       300


This table can be combined as follows. 


TABLE NO. 2

                          Scholarship
Salary            Low & Middle      High
High                   c              d
Middle & Low           a              b

Then, c = 46, a = 154, d = 48, and b = 52. Assume N = 300.

Assume x is a function of grades received in college. Suppose y is a function 
of salaries received. Assume that the frequencies x and y both follow the 
normal curve of error whose standard deviation is equal to one. Also assume 
that the average of x and the average of y are both equal to zero. It is a 
matter of common knowledge that salaries are not arranged in a symmetrical 
fashion; y is not a linear function of salaries. 

In the formulas which follow, r is the symbol for the coefficient of correlation.
These formulas are applied to Table No. 2. We have

$$\frac{1}{\sqrt{2\pi}}\int_0^h e^{-x^2/2}\,dx = \frac{(a+c)-(b+d)}{2N} = .167, \text{ and } h = .4316.$$

Also

$$\frac{1}{\sqrt{2\pi}}\int_0^k e^{-y^2/2}\,dy = \frac{(a+b)-(c+d)}{2N} = .187, \text{ and } k = .4874.$$

Then,

$$H = \frac{1}{\sqrt{2\pi}}\,e^{-h^2/2} = .3635, \text{ and } K = \frac{1}{\sqrt{2\pi}}\,e^{-k^2/2} = .3543.$$

All the quantities except r in the following approximate equation are known:

$$\frac{ad-bc}{N^2HK} = r + \frac{hk}{2}r^2 + \frac{(h^2-1)(k^2-1)}{6}r^3 + \frac{hk(h^2-3)(k^2-3)}{24}r^4 + \frac{(h^4-6h^2+3)(k^4-6k^2+3)}{120}r^5.$$

Therefore,

$$.0261r^5 + .0681r^4 + .1034r^3 + .1052r^2 + r - .4314 = 0.$$

Then, r is approximately equal to .4051. Consequently, for practical purposes
we can assume that r = .4.
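The arithmetic above can be retraced with a short script. This is only a sketch under stated assumptions: the cell letters follow Table No. 2 with a = 154 (the sum 31 + 39 + 47 + 37 from Table No. 1), the series is truncated at the fifth power of r as in the text, the root is found by bisection (a choice made here, not stated in the note), and the function name `tetrachoric_series` is hypothetical. It uses unrounded values of h and k, so the result differs slightly from .4051 in the fourth decimal.

```python
from statistics import NormalDist


def tetrachoric_series(a, b, c, d):
    """Solve the five-term series for r from a fourfold table.

    a, b, c, d are the cell frequencies arranged as in Table No. 2
    (an assumption of this sketch about the intended arrangement).
    """
    N = a + b + c + d
    nd = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Dividing points h and k from the marginal proportions.
    h = nd.inv_cdf(0.5 + ((a + c) - (b + d)) / (2 * N))
    k = nd.inv_cdf(0.5 + ((a + b) - (c + d)) / (2 * N))
    H, K = nd.pdf(h), nd.pdf(k)
    target = (a * d - b * c) / (N * N * H * K)
    coeffs = [
        1.0,
        h * k / 2,
        (h * h - 1) * (k * k - 1) / 6,
        h * k * (h * h - 3) * (k * k - 3) / 24,
        (h**4 - 6 * h * h + 3) * (k**4 - 6 * k * k + 3) / 120,
    ]

    def g(r):
        return sum(cf * r ** (i + 1) for i, cf in enumerate(coeffs)) - target

    lo, hi = -1.0, 1.0
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change on [-1, 1]; bisection not applicable")
    for _ in range(60):  # halve the bracket until it is far below rounding error
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


r = tetrachoric_series(a=154, b=52, c=46, d=48)
print(round(r, 4))
```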

28 Boodt Street John L. Roberts 

Brunswick, Maine


NOTE ON THE DERIVATION OF THE MULTIPLE CORRELATION 

COEFFICIENT 


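Consider N observed values of each of n variables. These $n \cdot N$ values may
be tabulated in a double-entry table as follows:

$$\begin{matrix}
X_{11} & X_{12} & X_{13} & \cdots & X_{1N} \\
X_{21} & X_{22} & X_{23} & \cdots & X_{2N} \\
\vdots & \vdots & \vdots &        & \vdots \\
X_{n1} & X_{n2} & X_{n3} & \cdots & X_{nN}
\end{matrix}$$

where $X_{ik}$ is the $k$th value of the $i$th variable.

Using the $i$th variable as the dependent variable, the general linear
relationship between the n variables may be expressed by

$$x_i = {}_ia_1 x_1 + {}_ia_2 x_2 + \cdots + {}_ia_{i-1} x_{i-1} + {}_ia_{i+1} x_{i+1} + \cdots + {}_ia_n x_n \qquad (1)$$

where

${}_ia_k$ is the general parameter which is to be determined empirically;

$x_r = X_r - M_r$;

$M_i$ is the arithmetic mean of the $i$th variable.

By the method of least squares, the constants of (1) must satisfy the normal
equations:

$$\begin{aligned}
(\Sigma x_1^2)\,{}_ia_1 + (\Sigma x_1x_2)\,{}_ia_2 + \cdots + (\Sigma x_1x_{i-1})\,{}_ia_{i-1} + (\Sigma x_1x_{i+1})\,{}_ia_{i+1} + \cdots + (\Sigma x_1x_n)\,{}_ia_n &= \Sigma x_1x_i \\
(\Sigma x_2x_1)\,{}_ia_1 + (\Sigma x_2^2)\,{}_ia_2 + \cdots + (\Sigma x_2x_{i-1})\,{}_ia_{i-1} + (\Sigma x_2x_{i+1})\,{}_ia_{i+1} + \cdots + (\Sigma x_2x_n)\,{}_ia_n &= \Sigma x_2x_i \\
\cdots\cdots \\
(\Sigma x_{i-1}x_1)\,{}_ia_1 + (\Sigma x_{i-1}x_2)\,{}_ia_2 + \cdots + (\Sigma x_{i-1}x_n)\,{}_ia_n &= \Sigma x_{i-1}x_i \\
(\Sigma x_{i+1}x_1)\,{}_ia_1 + (\Sigma x_{i+1}x_2)\,{}_ia_2 + \cdots + (\Sigma x_{i+1}x_n)\,{}_ia_n &= \Sigma x_{i+1}x_i \\
\cdots\cdots \\
(\Sigma x_nx_1)\,{}_ia_1 + (\Sigma x_nx_2)\,{}_ia_2 + \cdots + (\Sigma x_n^2)\,{}_ia_n &= \Sigma x_nx_i
\end{aligned}$$

where

$$\Sigma x_ix_j = \sum_{t=1}^{N}(X_{it} - M_i)(X_{jt} - M_j).$$

But

$$(\Sigma x_ix_j) = N r_{ij}\sigma_i\sigma_j, \qquad (\Sigma x_i^2) = N\sigma_i^2 = N r_{ii}\sigma_i\sigma_i \qquad (2)$$

where

$r_{ij}$ is the Pearsonian coefficient of correlation between the $i$th and $j$th
variables, $\sigma_i$, the standard deviation of the $i$th variable.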

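Substituting the right members of (2) in the normal equations, we obtain
the system:

$$\begin{aligned}
\sum_{k=1}^{n} r_{1k}\,\sigma_1\sigma_k\;{}_ia_k &= 0 \\
\sum_{k=1}^{n} r_{2k}\,\sigma_2\sigma_k\;{}_ia_k &= 0 \\
\cdots\cdots \\
\sum_{k=1}^{n} r_{i-1,k}\,\sigma_{i-1}\sigma_k\;{}_ia_k &= 0 \\
\sum_{k=1}^{n} r_{i+1,k}\,\sigma_{i+1}\sigma_k\;{}_ia_k &= 0 \\
\cdots\cdots \\
\sum_{k=1}^{n} r_{nk}\,\sigma_n\sigma_k\;{}_ia_k &= 0
\end{aligned} \qquad (3)$$

where

$${}_ia_i = -1.$$

Let

$$\Delta = \begin{vmatrix}
r_{11}\sigma_1\sigma_1 & \cdots & r_{1n}\sigma_1\sigma_n \\
\vdots & & \vdots \\
r_{n1}\sigma_n\sigma_1 & \cdots & r_{nn}\sigma_n\sigma_n
\end{vmatrix} \qquad (4)$$

$\Delta_{ii}$ be the first minor of the element $r_{ii}\sigma_i\sigma_i$ in
$\Delta$, ${}_{ik}\Delta$ be $\Delta$ with the $i$th and $k$th columns
interchanged, and ${}_{ik}\Delta_{ii}$ be the first minor of the element in the
$i$th column and $i$th row of ${}_{ik}\Delta$.

Solving (3) for ${}_ia_k$ by Cramer's rule, we find

$${}_ia_k = \frac{{}_{ik}\Delta_{ii}}{\Delta_{ii}}.$$

But it can easily be proved that

$${}_{ik}\Delta_{ii} = (-1)^{i-k+1}\Delta_{ik};$$

hence

$${}_ia_k = (-1)^{i-k+1}\,\frac{\Delta_{ik}}{\Delta_{ii}}.$$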
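Using cofactors of $\Delta$ instead of minors, we have

$${}_ia_k = -\frac{D_{ik}}{D_{ii}}.$$

Without writing the determinant out in full, we notice that the $\sigma$'s can be
factored out. Hence

$${}_ia_k = -\frac{\left(\prod_{j \ne i}\sigma_j\right)\left(\prod_{j \ne k}\sigma_j\right)K_{ik}}{\left(\prod_{j \ne i}\sigma_j^2\right)K_{ii}} = -\frac{\sigma_i K_{ik}}{\sigma_k K_{ii}}$$

where

$$K = \begin{vmatrix}
r_{11} & \cdots & r_{1n} \\
\vdots & & \vdots \\
r_{n1} & \cdots & r_{nn}
\end{vmatrix} \qquad (5)$$

and $K_{ik}$ is the cofactor of $r_{ik}$ in $K$.

Using these derived values for the coefficients, we may write (1) in the
symmetric form:

$$\frac{K_{i1}}{\sigma_1}(X_1 - M_1) + \frac{K_{i2}}{\sigma_2}(X_2 - M_2) + \cdots + \frac{K_{in}}{\sigma_n}(X_n - M_n) = 0,$$

or

$$\sum_{j=1}^{n}\frac{K_{ij}\,x_j}{\sigma_j} = 0.$$

For a multiple correlation coefficient, we use the formula

$$R_i^2 = 1 - \frac{\sum_{t=1}^{N}\left[-x_{it} + \sum_{k \ne i}{}_ia_k\,x_{kt}\right]^2}{N\sigma_i^2} \qquad (6)$$

which measures the amount of observed dispersion from the regression plane
in which $X_i$ is the dependent variable.

Substituting the values for the a's, we find

$$R_i^2 = 1 - \frac{1}{N K_{ii}^2}\sum_{t=1}^{N}\left[\frac{K_{i1}x_{1t}}{\sigma_1} + \frac{K_{i2}x_{2t}}{\sigma_2} + \cdots + \frac{K_{in}x_{nt}}{\sigma_n}\right]^2.$$

Squaring the bracket expression and using (2), we obtain

$$R_i^2 = 1 - \frac{1}{K_{ii}^2}\sum_{k=1}^{n}\sum_{l=1}^{n}\frac{K_{ik}K_{il}}{N\sigma_k\sigma_l}\sum_{t=1}^{N}x_{kt}x_{lt} = 1 - \frac{1}{K_{ii}^2}\sum_{k=1}^{n}K_{ik}\sum_{l=1}^{n}K_{il}\,r_{kl}.$$

The second sum is the sum of the products of the elements in the $k$th row
by the cofactors of the elements in the $i$th row. This sum is necessarily zero
unless k = i; but if k = i this sum is equal to K.

$$R_i^2 = 1 - \frac{1}{K_{ii}^2}(K_{ii}K) = 1 - \frac{K}{K_{ii}}.$$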
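The final formula $R_i^2 = 1 - K/K_{ii}$ is easy to check on a small case. The sketch below is an illustration only, with hypothetical function names: it evaluates the determinants by cofactor expansion (adequate for a handful of variables, and chosen here to avoid any dependency) and compares the result for three variables with the familiar closed form $(r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23})/(1 - r_{23}^2)$. The correlations used are made-up values.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, x in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * x * det(minor)
    return total


def multiple_r_squared(r, i):
    """R_i^2 = 1 - K / K_ii for a correlation matrix r and 0-based index i."""
    K = det(r)
    # K_ii: delete row i and column i (the cofactor sign is +1 on the diagonal).
    K_ii = det([row[:i] + row[i + 1:] for k, row in enumerate(r) if k != i])
    return 1 - K / K_ii


# Hypothetical correlations among three variables.
r12, r13, r23 = Fraction(1, 2), Fraction(2, 5), Fraction(3, 10)
R = [[1, r12, r13],
     [r12, 1, r23],
     [r13, r23, 1]]
print(multiple_r_squared(R, 0))
print((r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2))
```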

Oregon State Agricultural College William J. Kirkham

School of Science
Corvallis, Oregon
NOTE ON NUMERICAL EVALUATION OF DOUBLE SERIES¹


1. The Euler-Maclaurin summation formula has been extended to two
variables by Dr. Sheppard,² and Mr. Irwin,³ to determine cubature formulas.
A more complicated two-dimensional form was given by Baten⁴ involving
product polynomials, for which a remainder term was also calculated. The 
purpose of this note is to apply the simpler formula to the numerical evaluation 
of double series of positive terms. The method may be extended to multiple 
series of order p > 2. If the double series converges one may sum by rows 
(or columns), using the ordinary sum formula twice. The method is to take 
out a rectangular block of mn terms and then apply the formula to the remaining 
terms. By taking m and n sufficiently large one may cause the series resulting 
from the formula to converge sufficiently rapidly to obtain the sum to the 
desired number of decimal places. For practical work the error may be es- 
timated because of the asymptotic character of the series involved in the Euler- 
Maclaurin formula. 

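Write this in the form

$$\sum_{x=a}^{s} f(x) = \int_a^s f(x)\,dx + \tfrac{1}{2}f(a) + \tfrac{1}{2}f(s) - \frac{f'(a) - f'(s)}{12} + \frac{f'''(a) - f'''(s)}{720} - \frac{f^{\mathrm{v}}(a) - f^{\mathrm{v}}(s)}{30240} + \frac{f^{\mathrm{vii}}(a) - f^{\mathrm{vii}}(s)}{1209600} - \cdots \qquad (1)$$

If $s \to \infty$ one has accordingly in the ordinary case of convergence

$$\sum_{x=a}^{\infty} f(x) = \int_a^{\infty} f(x)\,dx + \tfrac{1}{2}f(a) - \frac{f'(a)}{12} + \frac{f'''(a)}{720} - \cdots \qquad (2)$$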

Now define v{x) == uix^ y) J u(x, 


y^b 


V) dy + hu{.x, b) - + 


12 


— • • • and wiy) = ^ u{x, y) = J u{x, y)dx + iu(a, y) — + 


y) 

720 


w « a — 1 00 h —1 

• ■ • , then 2 u(x, J/) = y) + 12 v{x) + wiy) 

X—l yl—1 aj—l ymml y—1 


(3) 


vix)dx + ii»(l) - 


+ wiy)dy - \wih) + iw(l) 


__ + . . • + g g uix, y) 


+ 


TO - -u^'d) w"'ih) - w"'il) 


12 


720 


+ 


¹ Presented to the Society, Nov. 30, 1934.

² W. F. Sheppard, “Some Quadrature Formulae,” Proc. London Math. Soc., Vol. xxxii,
1900.

³ J. O. Irwin, “Tracts for Computers,” No. X, Cambridge Univ. Press, 1923, On
Quadrature and Cubature.

⁴ W. D. Baten, “Remainder for the Euler-Maclaurin summation formula in two
independent variables,” Amer. Journal of Math., Vol. 54, 1932, pp. 265-276.
Instead of this one may use
$\sum_{x=1}^{a-1}\sum_{y=1}^{b-1} u(x, y) + \sum_{y=1}^{b-1} w(y) + \sum_{x=1}^{\infty} v(x)$.
The scheme of the double series may be illustrated by a sketch of a quadrant of
the xy-plane in which the point (x, y) represents the term u(x, y).

Evidently by taking a combination of results from (3) one may evaluate 

quite readily such finite sums as ^ v) where q and t are large. 

As an illustration of (3) consider 2 (®* + uV)”*- Here one needs to 
evaluate the integral of the summand. The transformations x = ay tan 0 
and y — l/t lead to a form which may be integrated by parts. The more 
complicated form 53 23 (a®* + 2ba:y + cy*)"* for the case in which s > 3/2, 
o > 1, might be handled byrnsing x = l/t and approximate integration by 
Simpson’s rule. 

Take as a second example $\sum\sum (x + y)^{-p}$, $p > 2$. The case of p = 4 was
carried out by taking a = b = 10 in (3) and carrying the computation to twelve
decimals. The series involved converge rapidly and a result was obtained
which differed by 2 in the 12th place from the true value 0.119 733 669 448+.

By summing diagonally one may convert this to the simple series
$\sum_{s=1}^{\infty} s(s+1)^{-p}$ or
$\sum_{s=2}^{\infty} (s-1)s^{-p} = \sum_{s=1}^{\infty} (s^{1-p} - s^{-p})$.
The method of summation diagonally may

be extended to X 52 > 2, a > 0, by the applications of the 

Euler-Maclaurin sum formulas (1), (2) in succession after a triangular array 
of terms have been omitted. 
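The diagonal form of the second example lends itself to a quick numerical check. The following sketch is an illustration rather than the author's computation: it sums $\sum_{s \ge 2}(s-1)s^{-4}$ by adding the first terms directly and estimating the tail from $s = M$ onward with the first few terms of the Euler-Maclaurin formula (2). The cut-off M = 50 and the function name are arbitrary choices made here; the derivatives are written out by hand for $f(s) = s^{-3} - s^{-4}$.

```python
def diagonal_sum_p4(M=50):
    """Approximate sum over x, y >= 1 of (x + y)^-4 via the diagonal series.

    Terms s = 2 .. M-1 are added directly; the tail from s = M to infinity
    uses integral + f(M)/2 - f'(M)/12 + f'''(M)/720.
    """
    def f(s):          # (s - 1) / s^4
        return s**-3 - s**-4

    def d1(s):         # first derivative of f
        return -3 * s**-4 + 4 * s**-5

    def d3(s):         # third derivative of f
        return -60 * s**-6 + 120 * s**-7

    head = sum(f(s) for s in range(2, M))
    integral = 1 / (2 * M**2) - 1 / (3 * M**3)   # integral of f from M to infinity
    tail = integral + f(M) / 2 - d1(M) / 12 + d3(M) / 720
    return head + tail


print(f"{diagonal_sum_p4():.12f}")
```

The printed value can be compared with the figure quoted in the text for p = 4.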

The form X can be written as the product of the single series 

(Zx-'’)(Z !/-«). 

2. Another method of numerical evaluation is the analog of that used for
single series by the author.⁵ Instead of rectangles one has right prisms of
square or rectangular cross-section. Instead of shifting the rectangles one unit
to the right to determine upper and lower bounds the prisms are shifted
diagonally so that they go effectively one unit in each variable. In the case of
a square base each prism is moved along the 45° line one diagonal unit length.
For the lower bound instead of trapezoids one uses truncated prisms. For example,
the prism of height $u_{mn}$ is cut by two planes, one determined by the upper
vertices $u_{mn}$, $u_{m,n+1}$, $u_{m+1,n}$ and the other by the upper vertices
$u_{m+1,n}$, $u_{m,n+1}$, $u_{m+1,n+1}$ of the truncated prism. The surface
$z = u(m, n)$ passes through all the upper corners of the truncated prisms. Each
prism is composed of two truncated triangular prisms. Now the volume of such a
triangular prism is the arithmetic mean of its vertical edges multiplied by the
area of its base.


⁵ “A New Method for Finding the Numerical Sum of an Infinite Series,” Amer. Math.
Monthly, vol. XL, No. 9, Nov., 1933, pp. 537-542.

Hence the difference in volume between the truncated rectangular prism
mentioned above and the prism of uniform height $z = u_{mn}$ can be shown to be

(4) $(5u_{mn} - u_{m+1,n+1} - 2u_{m+1,n} - 2u_{m,n+1})/6.$
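Formula (4) follows from the rule just stated for triangular prisms, and a few lines of exact arithmetic confirm it. This is a sketch under the assumption that the unit square base is divided along the diagonal joining the corners (m+1, n) and (m, n+1), each triangle having area one half; the function names and the sample heights are hypothetical, and the heights are taken as integers so that the fractions stay exact.

```python
from fractions import Fraction


def truncated_prism_volume(u_mn, u_m_n1, u_m1_n, u_m1_n1):
    """Volume over a unit square under the two planes described in the text.

    Each triangular prism has volume (mean of its three vertical edges) * (base area 1/2).
    """
    tri_a = Fraction(1, 2) * Fraction(u_mn + u_m_n1 + u_m1_n, 3)
    tri_b = Fraction(1, 2) * Fraction(u_m1_n + u_m_n1 + u_m1_n1, 3)
    return tri_a + tri_b


def formula_4(u_mn, u_m_n1, u_m1_n, u_m1_n1):
    """Right member of (4): uniform prism of height u_mn minus the truncated prism."""
    return Fraction(5 * u_mn - u_m1_n1 - 2 * u_m1_n - 2 * u_m_n1, 6)


# Sample integer heights for a decreasing surface.
heights = (12, 9, 8, 6)
print(heights[0] - truncated_prism_volume(*heights), formula_4(*heights))
```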

Let us consider series whose corresponding surfaces do not rise above these 
truncated prisms. This sort of truncated prism differs less from the volume 
under the surface than the one formed by the diagonal joining the other pair 
of upper vertices and planes through it for upper faces. The lower bound for 
the remainder is the volume under the surface extending to infinity in the 
m and n directions plus the sum of these differences. Accordingly one deter- 
mines as the lower bound for the remainder Rm~i. n-i after summing a rec- 

m—l n— I 

tangular array 2 the form 

y-i 

00 00 m 

“f" 2Wl,n ““ 5‘U»M,n)/6 -j- ^ ^ ^ 'W»,n 

(5) 

^ ' n rco rco rm 

-f- ^ “i" / / llm,nd7n>dTl I I Um , n dtfldTl ^ l,n— 1 • 

j—1 Jl Jm Jn Jl 

The upper bound may likewise be given as follows: 

rzo rv> r * * n 

(6) < S + T + f / Un,,ndmdn - A; 2 -f- 23 

yn— 1 ym— 1 L;->n— 1 J 


where 


« n— 1 

Uii , 

i^m ;«"1 


r = E Zm.7, 


(8) k = 


rzo p 

/ / Wm.n 

yn-l >-1 


1*90 1*90 00 00 

I I Um,nd7TldTl ““ 

Jn Jm j— n— I <»m 


l.n—l *4“ 'W»n,n— 1 


An alternate definition of k is 


Z — J' y^mtudflfldTl ('Wm— l,n— 1 Wfn,n)* 


An illustration is afforded by ]C (^ + 1) * for which k = .45614, 

n— 1 m—l 

¥ = .44536 when m = n — 10 in (8), (9). In this case (6) gave an error of 
— 14 X 10~* and (6) an error of 10“*. 

S and T may be evaluated by the method published in the Monthly.* 

One must assume that k increases with m and n. It is evident that for this
method and for the one in the Monthly differentiability is not required but
only integrability, conditions less restrictive than those required by the
Euler-Maclaurin summation formulas. It is also clear that the method may be
extended to multiple series of positive terms of multiplicity greater than two.

* Loc. cit.

Department of Mathematics Chester C. Camp

University of Nebraska 
Lincoln, Nebraska 



REPORT OF THE ANNUAL MEETING OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 

The meeting of the Institute of Mathematical Statistics for 1936 was held in 
Chicago on December 28-30 in connection with the meetings of the American 
Statistical Association and the Econometric Society. 

In addition to the sessions at which voluntary papers were read, a session with
invited papers was held on the morning of December 30. At the invitation of
the Program Committee, Professor P. R. Rider presented a paper on “Recent
Advances in Mathematical Statistics: Factorial Design” and Professor Harold
Hotelling spoke on “The Analysis of Sets of Correlated Variates.”

Professor C. C. Craig of the University of Michigan and Professor A. R.
Crathorne of the University of Illinois constituted the Program Committee.

At the business meeting of the Institute, the following officers were elected 
for the year 1937: President, Dr. W. A. Shewhart; Vice-Presidents, Professors 
P. R. Rider and B. H. Camp; Secretary-Treasurer, Professor A. T. Craig. 

The Institute voted that it would presumably hold its 1937 meeting with the 
American Mathematical Society. 

Allen T. Craig, 
Secretary, 


NOTICE TO SUBSCRIBERS 

Plans are under way to include in the Annals a new section, entitled “Numerical
Illustrations of Statistical Methodology.” This new section will be a
regular feature of the Annals, and will deal with the application of statistical 
technique and theory to the solution of problems in various fields. It is hoped 
that this new section will be of considerable value to those who are primarily 
interested in numerical applications of the more recent theoretical developments 
in mathematical statistics. 

The Editor will welcome contributions to this new section of the Annals. 
REGRESSION AND CORRELATION EVALUATED BY
A METHOD OF PARTIAL SUMS

By Felix Bernstein

“To be sure, Laplace viewed the matter in a similar way but he selected the
absolute value of the error as a measure of loss. But if we mistake not, this 
position is certainly not less arbitrary than our own; that is to say, whether the 
double error is to be considered just as tolerable as, or worse than, the simple 
error twice repeated and whether it is thus more fitting to ascribe to the double 
error only a double weight, or a greater one, is a question which is neither in 
itself clear nor determinable by mathematical proof but has to be left entirely 
to individual discretion. 

“Furthermore, it cannot be denied that the assumption under discussion
violates the principle of continuity and precisely for this reason the procedure 
based on it strongly defies analytic treatment while the results to which our 
principle leads have the advantage of simplicity as well as of generality.” — 

C. F. Gauss: Theoria combinationis observationum, pars prior, art. 6.

Since the “Theoria Combinationis” of C. F. Gauss appeared in the year 1821,
a century of Mathematical Statistics has been dominated by the ideas of this 
classical treatise — ideas whose fertility does not seem to be exhausted even 
today. 

The germ of most modern contributions to mathematical statistics, in fact
also of those of Karl Pearson and his school, goes back decidedly to this paper.
Though the immediate achievements of Gauss are so conspicuous as not to 
need any comment, a true critical appreciation of the work can be gained only 
by comparing it with the previous methods of Laplace, superseded by those of 
Gauss. 

For such critical appreciation, C. F. Gauss himself has prepared the ground 
in the lines quoted at the beginning of this article. To Gauss the standard 
deviation is a measure of uncertainty or risk of a game in which the errors of 
observation are considered as causing only losses. In this he follows the lead 
of his great predecessor. The difference between them is that Gauss adopts 
the square of the error as a measure of the loss while Laplace adopts its absolute 
value for this purpose. Either choice frees the error from its sign so that the 
loss is the same regardless of the sign of the error. 

Gauss considers this choice of the measure of the loss as purely conventional. 
Therefore he feels justified in adopting the square of the error because in adopt- 
ing the square instead of the absolute value of the error, the mathematics he 
uses remains in the easily accessible domain of analytical processes. This 
creates for these methods a superiority in elegance, simplicity, and generality. 

The modern developments of mathematical statistics, based on the principles
of Gauss, have confirmed the correctness of this viewpoint. This has proved
true particularly in the theory of analysis of variance developed by R. A. Fisher
and in the more general theory of semi-invariants, first defined by T. N. Thiele.

The inadequacy of the Gaussian method, which seriously impairs its value for
statistical use, has come to light through the investigations of Karl Pearson of
distributions of one and two variables. Since the moments of higher order 
involve standard deviations of increasing magnitude the characterisation of the 
distributions by means of the moments, in line with the Gauss-Thiele concepts, 
becomes practically impossible. Therefore it was of the greatest interest that 
Lindeberg was able to derive an expression for the standard deviation of a 
measure of skewness constructed not on Gaussian but on Laplacian lines, 
namely based exclusively upon the sign of the error. The mathematical diffi- 
culties surmounted by Lindeberg by a very involved and difficult analysis— 
with some clearly indicated gaps in the proofs — are precisely of the character 
of those that Gauss wished to avoid. Encouraged by the success of Lindeberg, 
I have developed in two papers* the standard deviations of more general moments
and the correlations between them of which the mean deviation of Laplace
and Lindeberg’s measure of skewness are special cases. The proofs have been
arrived at by a rather simple and rigorous procedure. These new moments, 
together with the old ones, form a new system of statistical characteristics by 
which a distribution in one or two variables can be described by expressions 
of lower order and therefore of greater precision. This method makes un- 
necessary the use of moments of higher order than the third. 

But another point of interest is still involved. It has been assumed that the 
Gaussian characteristics give a greater amount of information than those of 
Laplace. This is proved, however, only for the case of the normal distribution
$\frac{h}{\sqrt{\pi}}\,e^{-h^{2}x^{2}}$. This was recognized by Gauss himself in his paper of April, 1816,
that appeared five years earlier than the Theoria Combinationis Observationum.
In article 6 of his paper, he says that the constant h of a normal distribution
obtained from one hundred observations by the use of the standard error is 
as exact as that obtained from one hundred fourteen observations in which 
the mean deviation is used. Hence with a given number of observations only 
the equivalent of 88% of the total are used by the second method. This does 
not hold true for all distributions. The following theorem can easily be proved : 
The amount of information as defined above, furnished by the use of the mean 
deviation is greater, equal to, or less than that furnished by the standard devi- 
ation, depending respectively upon whether 

* Felix Bernstein: “Die mittleren Fehlerquadrate und Korrelationen der Potenzmomente
und ihre Anwendung auf Funktionen der Potenzmomente,” Metron, Vol. X, N. 3
(Nov. 1932).

Felix Bernstein: “Über den mittleren Fehler der Potenzmomente.” Zeitschr. f. d. ges.
Vers.-Wissenschaft, Band 30, Heft 3, March 1930.
$$(\beta_2 - 1) \gtreqless 4(\beta_0 - 1),$$

where $\beta_2 = \mu_4/\mu_2^2$, $\beta_0 = \mu_2/\vartheta^2$, $\mu_k$ is the k-th moment and $\vartheta$ the mean deviation.

For example, in the distribution $\tfrac{1}{2}e^{-|x|}$, the mean deviation furnishes a greater
amount of information than the standard deviation.*
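
The comparison stated in the theorem can be checked numerically. The following is a minimal sketch, assuming the usual large-sample relative variances of the two estimates, $(\beta_2 - 1)/4N$ for the standard deviation and $(\beta_0 - 1)/N$ for the mean deviation. The function name `information_ratio` is hypothetical and does not come from the paper.

```python
from math import pi, sqrt


def information_ratio(mu2, mu4, mean_dev):
    """Information from the mean deviation relative to the standard deviation.

    Assumed large-sample relative variances (an assumption, not quoted from the paper):
      standard deviation: (beta2 - 1) / (4 N)
      mean deviation:     (beta0 - 1) / N
    A ratio above 1 favours the mean deviation.
    """
    beta2 = mu4 / mu2 ** 2          # fourth moment in units of mu2^2
    beta0 = mu2 / mean_dev ** 2     # second moment in units of (mean deviation)^2
    return (beta2 - 1) / (4 * (beta0 - 1))


# Normal law with unit variance: mu2 = 1, mu4 = 3, mean deviation = sqrt(2/pi).
normal = information_ratio(1.0, 3.0, sqrt(2 / pi))

# Distribution (1/2) e^{-|x|}: mu2 = 2, mu4 = 24, mean deviation = 1.
laplace = information_ratio(2.0, 24.0, 1.0)

print(round(normal, 3), laplace)   # 0.876 1.25
```

Under these assumptions the normal law gives a ratio of about 0.88, in agreement with the figures of 100 against 114 observations quoted from Gauss, while the second distribution gives a ratio above unity.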

In the present paper, we shall discuss the practical use of expressions for 
correlation and regression in which the new type of statistics formed along 
Laplacian lines will be used. These new expressions are of a linear form and 
can be computed therefore more easily than those of Karl Pearson. The amount 
of information given by these expressions is less than that given by the expres- 
sions of Pearson if the normal law, in two variables, is fulfilled. For other 
distributions, however, this is not generally true. The determination of the 
standard deviations of these new expressions is given in Metron.* 

The application of the new expressions of regression and correlation to grouped 
data is set forth here for the first time. The method is strongly recommended 
for all cases in which the data lose reliability with increasing deviations from 
the mean. Deviations in the new method enter the expressions only in the 
first degree and not in the second as in the case of Pearson's. It is obvious 
that the influence of the doubtful extreme readings is, therefore, considerably 
lessened. Since our expressions are linear, no adjustments for grouping (Shep- 
pard's corrections) are necessary. 

It ought to be mentioned here that linear expressions for the measurement 
of correlation have been set up before. 

K. Pearson (Biometrika) and Egon Pearson (Biometrika) have derived an
expression called “linear correlation ratio” which in case of linear regression is
identical with the correlation coefficient.

K. Pearson also discusses the linear correlation coefficient


-K 


xsgx 


xsgy \ 

ymr 


* To this second type of distribution curves also belongs y » 
of two Gaussian curves with the same origin, i.e. ^(x) » } 
1.6 < ifc < 3.4. 


^(x) where x(x) is the mean 
/ 




I owe this remark and some other valuable suggestions regarding the subject of this 
paper to Mr. Myron Fuchs. 

* Op. cit. 
suggested by Lena and various other linear expressions, all similar to our expression
(1). He finds that they are all equal to the quadratic correlation coefficient
in the case of a Gaussian distribution.

However, their expressions were not recommended by those authors for the 
determination of correlation between quantitative variables, because — 

1. No easy and practicable methods were given for their evaluation in the 
case of grouped data. 

2. Their standard deviations were not determined. 

We now proceed to define the new formulas and to describe the methods for 
their evaluation. The proofs are furnished in the Appendix to this paper. 


Let $r_1$ and $r_2$ denote the regression coefficients of x on y and y on x respectively,
and r, as usual, the coefficient of correlation, and $\bar{x}$ and $\bar{y}$ the arithmetic
means of the x’s and y’s. Let us take $\bar{x}$, $\bar{y}$ as the origin, so that x, y are the
deviations from the mean. We have

$$r_1 = \frac{S_{+y}\,x}{S_{+y}\,y} \quad\text{or}\quad r_1 = \frac{S_{-y}\,x}{S_{-y}\,y},$$

(1) $$r_2 = \frac{S_{+x}\,y}{S_{+x}\,x} \quad\text{or}\quad r_2 = \frac{S_{-x}\,y}{S_{-x}\,x},$$

$$r = \sqrt{r_1 r_2}.$$

$S_{+y}\,x$ denotes a partial sum of the x’s, this sum being extended over all the x’s
of the observations whose y is positive, and the other sums have a corresponding
meaning.

It should be noted, though, that if data occur whose y-deviation is 0 (practically
never in a grouped table) one-half of the sum of these x’s should be added to $S_{+y}\,x$.
In the $S_{+x}\,y$ a similar addition should be made in case observations occur in which x
is zero. (See Table IV.)

The formulas (1) and all following ones will be proved in the appendix to this
article.*


* Using $r_1$ and $r_2$ of (1) the regression lines are $y = r_2 x$ and $x = r_1 y$. They are those
straight lines which fit the data best according to the method of least squares, if the weight
of the deviations is taken inversely proportional to the absolute value of the variable.
Taking x for instance as the independent variable, $r_2$ is the value of m which minimizes

$$\sum \frac{1}{|x|}\,(y - mx)^2 \qquad (\textstyle\sum \text{ extended over all data } x, y).$$
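The standard deviations of $r_1$ and $r_2$ are

(2) $$\sigma_{r_1}^2 = r_1^2\,\frac{\pi}{2N}\,\bigl(1 + m(m - 2r)\bigr) \quad\text{where } m = \frac{S\,x\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,y},$$

$$\sigma_{r_2}^2 = r_2^2\,\frac{\pi}{2N}\,\bigl(1 + n(n - 2r)\bigr) \quad\text{where } n = \frac{S\,y\,\mathrm{sg}\,y}{S\,y\,\mathrm{sg}\,x}.$$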


We are now going to illustrate the computation of r and for this purpose
we shall use a table of Pearson’s which gives the correlation between the heights
of fathers and daughters.

The totals at the right and lower end of the table are first computed and
the bracketed numbers are the sums of the numbers that precede. The
means are

$$\bar{y} = \frac{1659.5 - 1179}{1376} = \frac{480.5}{1376}, \qquad \bar{x} = \frac{1650.9 - 1390}{1376} = \frac{260.9}{1376},$$

whose signs determine on which side of the working mean to “quarter” the
table. This quartering is done in Table I by the lines vv and hh. Then the
totals above the heavy horizontal separating line hh and those to the left of
the vertical separating line vv are found, e.g. 2, 4.5, 7.25, ... and .5, .5, 0, ... .
Multiplying these totals by the respective class marks, we find the outside lines:
18, 36, 50.75, ... and 5.5, 5, 0, ... .

$S_{-y}\,x$ is now = 1107.5 - 420.5 = 687, and an adjustment for the fact that a
working mean has been used has yet to be made. This adjustment is $\bar{x}\,N_{-y}$,
where $N_{-y}$ is the number of negative y’s. ($N_{-y}$ = 728.)

We have therefore for the adjusted values

$$S_{-y}\,x = 1107.5 - 420.5 + \frac{260.9}{1376}\cdot 728 = 825.07$$

$$S_{-y}\,y = 1179 + \frac{480.5}{1376}\cdot 728 = 1433.21$$

$$r_1 = .5757 \qquad r_2 = .5170 \qquad r = .546$$

The standard deviations, according to the formulas (2), are

$$\sigma_{r_1} = .031 \qquad \sigma_{r_2} = .027$$



TABLE I

Correlation between Heights of Fathers and Daughters
x = Height of Fathers, y = Height of Daughters
In Inches

Working Mean x = 67.5
Class width 1 Inch
Claaa width 1 Inch 
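The standard deviation of $r^2 = r_1 \times r_2$ has to be estimated by using the
general formula for the standard deviation of the product c of two variables
a and b,

$$\frac{\sigma_c^2}{c^2} = \frac{\sigma_a^2}{a^2} + \frac{\sigma_b^2}{b^2} + \frac{2R\,\sigma_a\sigma_b}{ab},$$

R being the correlation coefficient between a and b. Since -1 < R < +1,
substitution of these limits for R leads to the inequalities

$$\left|\frac{\sigma_a}{a} - \frac{\sigma_b}{b}\right| < \frac{\sigma_c}{c} < \frac{\sigma_a}{a} + \frac{\sigma_b}{b};$$

putting a = $r_1$, b = $r_2$, c = $r^2$ we have

$$\left|\frac{\sigma_{r_1}}{r_1} - \frac{\sigma_{r_2}}{r_2}\right| < \frac{\sigma_{r^2}}{r^2} < \frac{\sigma_{r_1}}{r_1} + \frac{\sigma_{r_2}}{r_2}.$$

Considering the relation $\sigma_{r^2} = 2r\,\sigma_r$

we have $$\frac{r}{2}\left|\frac{\sigma_{r_1}}{r_1} - \frac{\sigma_{r_2}}{r_2}\right| < \sigma_r < \frac{r}{2}\left(\frac{\sigma_{r_1}}{r_1} + \frac{\sigma_{r_2}}{r_2}\right)$$
from which we derive with sufficient approximation

$$\sigma_r < .030$$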
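
The arithmetic behind this bound can be retraced as follows. This is only a sketch using the values quoted above for Table I; the variable names are chosen here for illustration, and the relation between the error of $r^2$ and that of r is the first-order one, $\sigma_{r^2} = 2r\,\sigma_r$.

```python
from math import sqrt

# Values quoted in the text for Table I.
r1, sigma_r1 = 0.5757, 0.031
r2, sigma_r2 = 0.5170, 0.027

r = sqrt(r1 * r2)                      # since r^2 = r1 * r2

# Relative errors of the two regression coefficients.
rel1 = sigma_r1 / r1
rel2 = sigma_r2 / r2

# sigma_{r^2} = 2 r sigma_r, hence sigma_r = (r / 2) * (relative error of r^2).
lower = 0.5 * r * abs(rel1 - rel2)     # limit reached for R = -1
upper = 0.5 * r * (rel1 + rel2)        # limit reached for R = +1

print(round(r, 3), round(lower, 4), round(upper, 3))   # 0.546 0.0004 0.029
```

The upper limit comes out just under .030, which is the figure retained in the text.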


A slightly different arrangement for computing r has been made in the 
following table. 


TABLE II

Correlation between diameter of the stem and length of the longest flower petal of
Trientalis europaea*

        PS       3   15   34   45   30    6    2    0    0    0    0
  PS y \ x      -4   -3   -2   -1    0    1    2    3    4    5    6   Total
   1    -4       1    .    .    .    .    .    .    .    .    .    .       1
   7    -3       1    4    1    1    .    .    .    .    .    .    .       7
  29    -2       1    9   16    3    1    .    .    .    .    .    .      30
  33    -1       .    2    9   22    9    2    1    .    .    .    .      45
  27     0       .    .    8   19   20    4    1    .    .    .    .      52
   8     1       1    .    .    7   18   12    6    4    .    .    .      48
   1     2       .    .    .    1    8    9    3    2    1    .    .      24
         3       .    .    .    .    .    3    6    4    1    .    .      14
         4       .    .    .    .    .    .    2    2    1    2    .       7
         5       .    .    .    .    .    .    .    .    1    3    .       4
         6       .    .    .    .    .    .    .    .    1    .    1       2
     Total       4   15   34   53   56   30   19   12    5    5    1     234

*E. Czuber: Die statistischen Forschungsmethoden, Wien, 1921. 
TABLE III

x = Diameter of the stem.
y = Length of the longest flower petal in millimeters.
Working mean: x = .825, y = 34.5.
Class width of x = .4 mm., of y = 6 mm.

   x     Total      P.S.        y     Total      P.S.
         times x    times x           times y    times y
  -4       16         12       -4        4          4
  -3       45         45       -3       21         21
  -2       68         68       -2       60         58
  -1       53         45       -1       45         33
   0     (182)      (170)       0     (130)      (116)
   1       30          6        1       48          8
   2       38          4        2       48          2
   3       36          0        3       42
   4       20          0        4       28
   5       25          0        5       20
   6        6          0        6       12
         (155)       (10)             (198)       (10)
 Mean     -27                          +68

The P.S. columns are the partial sums as explained in the previous table. 
The work of multiplying the totals by the class marks and of adding them has 
been separated here from the table. 

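We obtain N = 234, $N_{-x}$ = 106, $N_{-y}$ = 135

$$r_1 = \frac{170 - 10 - \frac{27}{234} \times 135}{130 + \frac{68}{234} \times 135} = .834$$

$$r_2 = \frac{116 - 10 + \frac{68}{234} \times 106}{182 - \frac{27}{234} \times 106} = .805$$

$$r = .82$$

Pearson’s coefficient for this table is r = .83.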

Finally we illustrate by a small non-grouped table where the partial sums 
can be written down immediately. 
TABLE IV

Correlation between Ages of Husband and Wife

 Age of     Age of     Deviation    Deviation
 Husband    Wife       Husband      Wife
   22         18          -8           -8
   24         20          -6           -6
   26         20          -4           -6
   26         24          -4           -2
   27         22          -3           -4
   27         24          -3           -2
   28         27          -2           +1
   28         24          -2           -2
   29         21          -1           -5
   30         25           0           -1
   30         29           0           +3
   30         32           0           +6
   31         27          +1           +1
   32         27          +2           +1
   33         30          +3           +4
   34         27          +4           +1
   35         30          +5           +4
   35         31          +5           +5
   36         30          +6           +4
   37         32          +7           +6
 Ave 30       26


Here 0-deviations occur in the third column. Hence*

$$S_{+x}\,y = 26 + \tfrac{1}{2} \times 8 = 30, \quad S_{+x}\,x = 33, \quad S_{+y}\,x = 31, \quad S_{+y}\,y = 36,$$

$$r_1 = .86, \quad r_2 = .91, \quad r = .88 \quad (\text{Pearson’s } r = .86)$$
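
The computation for ungrouped data can be sketched in a few lines. The code below is an illustration only: the function names `partial_sum` and `partial_sum_correlation` are hypothetical, the ages are those listed in Table IV, and observations with a zero deviation are counted with weight one-half as described earlier.

```python
from math import sqrt


def partial_sum(values, selector):
    """Sum `values` over observations whose `selector` is positive,
    adding one half of the value wherever `selector` is exactly zero."""
    total = 0.0
    for v, s in zip(values, selector):
        if s > 0:
            total += v
        elif s == 0:
            total += 0.5 * v
    return total


def partial_sum_correlation(xs, ys):
    """Return (r1, r2, r) from the positive partial sums of formula (1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    dx = [x - x_mean for x in xs]                    # deviations from the mean
    dy = [y - y_mean for y in ys]
    r1 = partial_sum(dx, dy) / partial_sum(dy, dy)   # regression of x on y
    r2 = partial_sum(dy, dx) / partial_sum(dx, dx)   # regression of y on x
    return r1, r2, sqrt(r1 * r2)


# Ages of husband and wife as listed in Table IV.
husband = [22, 24, 26, 26, 27, 27, 28, 28, 29, 30,
           30, 30, 31, 32, 33, 34, 35, 35, 36, 37]
wife = [18, 20, 20, 24, 22, 24, 27, 24, 21, 25,
        29, 32, 27, 27, 30, 27, 30, 31, 30, 32]

r1, r2, r = partial_sum_correlation(husband, wife)
print(round(r1, 2), round(r2, 2), round(r, 2))   # 0.86 0.91 0.88
```

The partial sums formed inside the function are 31 and 36 for $r_1$ and 30 and 33 for $r_2$, the same figures as in the text.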

Appendix 

Proof of formula (1), page 1. The following notations will be used:

$(f(x))^{0}$ = probable value of f(x)

$(f(y))^{0}_{x}$ = probable value of f(y) for a fixed x.

sg x = sign of x = $x/|x|$ for x ≠ 0; sg x = 0 if x = 0.


* See page 7. 






The assumption of linear regression means that

(4) $$y^{0}_{x} - y^{0} = r_{y:x}\,(x - x^{0}).$$

We multiply both sides of (4) by some arbitrary function $\phi(x)$ of x and get

$$(y^{0}_{x} - y^{0})\,\phi(x) = r_{y:x}\,(x - x^{0})\,\phi(x).$$

Both sides are functions of x. We shall take their probable values for all x’s.

Now, for a fixed x, $y^{0}_{x}\,\phi(x) = (y\,\phi(x))^{0}_{x}$ and the probable value of $(y\,\phi(x))^{0}_{x}$ for
all x’s is equal to the total probable value $(y\,\phi(x))^{0}$. So we have

$$(y\,\phi(x))^{0} - (y^{0}\phi(x))^{0} = r_{y:x}\,((x - x^{0})\,\phi(x))^{0}$$

(6) $$r_{y:x} = \frac{((y - y^{0})\,\phi(x))^{0}}{((x - x^{0})\,\phi(x))^{0}}.$$

If now we take $x^{0}$, $y^{0}$ as the origin, we get

$$r_{y:x} = \frac{(y\,\phi(x))^{0}}{(x\,\phi(x))^{0}}$$

and similarly

$$r_{x:y} = \frac{(x\,\phi_1(y))^{0}}{(y\,\phi_1(y))^{0}}$$

where $\phi_1$ is another arbitrary function.

Replacing the probable values by the respective arithmetic means we get

$$r_{y:x} = \frac{S\,y\,\phi(x)}{S\,x\,\phi(x)}, \qquad r_{x:y} = \frac{S\,x\,\phi_1(y)}{S\,y\,\phi_1(y)}$$

with $\bar{x}$, $\bar{y}$ as the origin.

By a suitable choice of the still arbitrary functions $\phi$ and $\phi_1$, we may derive
all the various expressions for regression coefficients. Taking, for instance,
$\phi(x) = x$, $\phi_1(y) = y$, we get Pearson’s expressions. Taking $\phi(x) = \mathrm{sg}(x - a_1)$,
$\phi_1(y) = \mathrm{sg}(y - a_2)$, $a_1$ and $a_2$ being constants, we have

(7) $$r_{y:x} = \frac{S\,y\,\mathrm{sg}(x - a_1)}{S\,x\,\mathrm{sg}(x - a_1)}, \qquad r_{x:y} = \frac{S\,x\,\mathrm{sg}(y - a_2)}{S\,y\,\mathrm{sg}(y - a_2)}$$

and if we make $a_1 = a_2 = 0$

(8) $$r_{y:x} = \frac{S\,y\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,x}, \qquad r_{x:y} = \frac{S\,x\,\mathrm{sg}\,y}{S\,y\,\mathrm{sg}\,y}.$$

Since Sx = Sy = 0, we can add Sy or Sx to the numerators and denominators.
Adding Sy to the numerator, Sx to the denominator and multiplying both
sides of the fraction by ½ we get

(9) $$r_{y:x} = \frac{\tfrac{1}{2}\,S\,y\,(\mathrm{sg}(x - a_1) + 1)}{\tfrac{1}{2}\,S\,x\,(\mathrm{sg}(x - a_1) + 1)}.$$
Instead of (9) we can write

(10) $$r_{y:x} = \frac{S_{x > a_1}\,y + \tfrac{1}{2}\,S_{x = a_1}\,y}{S_{x > a_1}\,x + \tfrac{1}{2}\,S_{x = a_1}\,x}$$

since the operations of (9) multiply the y ordinates by 0, ½, 1 according as the
x’s are <, =, > $a_1$.

The expression (10), with a suitable choice of $a_1$, should be used for the purpose
of numerical calculation of r. For instance, when calculating r from the data
of Table IV, we took $a_1 = a_2 = 0$ and had

$$r_{y:x} = \frac{S_{+x}\,y + \tfrac{1}{2}\,S_{x = 0}\,y}{S_{+x}\,x}.$$

When dealing with data which are arranged in a grouped table (Tables I
and II) we take $a_1$ equal to the x-ordinate of that class line which is nearest to
the mean. (In Table I, $a_1 = .5 - \bar{x}$.)* With that choice of $a_1$ the sums
$S_{x = a_1}$ disappear and the sums $S_{x > a_1}$ are equivalent to the corresponding sums
$S_{+x}$. Hence we have

(11) $$r_{y:x} = \frac{S_{x > a_1}\,y}{S_{x > a_1}\,x} = \frac{S_{+x}\,y}{S_{+x}\,x}$$

and similarly

$$r_{x:y} = \frac{S_{+y}\,x}{S_{+y}\,y}.$$

Instead of (9) we can also write

(9a) $$r_{y:x} = \frac{\tfrac{1}{2}\,S\,y\,(\mathrm{sg}(x - a_1) - 1)}{\tfrac{1}{2}\,S\,x\,(\mathrm{sg}(x - a_1) - 1)}.$$

This leads to

(11a) $$r_{y:x} = \frac{S_{-x}\,y}{S_{-x}\,x} \quad\text{and}\quad r_{x:y} = \frac{S_{-y}\,x}{S_{-y}\,y}.$$

* It is desirable to choose the absolute values of the a’s small so that the maximum number
of data enter into the calculation of r. However, to take $a_1 = a_2 = 0$ would necessitate a
division of the middle arrays of a grouped table, a laborious process. Hence the choice
of the a’s as described above.
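Proof of the standard deviations of Formula (2).

In my article on standard deviations and correlations of moments* the standard
deviations of the expressions used in this article have been derived.

In the following, the notation of the Metron article just referred to will be
used. We use the symbols:

$$P_{m,n} = \frac{1}{N}\sum x^{m} y^{n}$$

$$P_{/m,n} = \frac{1}{N}\sum x^{m}\,\mathrm{sg}\,x\;y^{n}$$

$$P_{m,/n} = \frac{1}{N}\sum x^{m} y^{n}\,\mathrm{sg}\,y$$

$$P_{/m,/n} = \frac{1}{N}\sum x^{m}\,\mathrm{sg}\,x\;y^{n}\,\mathrm{sg}\,y$$

The summations indicated extend over all observations. The true or probable
values of the same expressions are indicated by using p instead of P.

$$r_{x:y} = r_1 = \frac{P_{1,/0}}{P_{0,/1}}$$

We derive the standard deviations by defining the deviations as first variations.

$$\log r_1 = \log P_{1,/0} - \log P_{0,/1}$$

$$\frac{\delta r_1}{r_1} = \frac{\delta P_{1,/0}}{p_{1,/0}} - \frac{\delta P_{0,/1}}{p_{0,/1}}$$

$$\frac{(\delta r_1)^2}{r_1^2} = \frac{(\delta P_{1,/0})^2}{p_{1,/0}^2} + \frac{(\delta P_{0,/1})^2}{p_{0,/1}^2} - \frac{2\,\delta P_{1,/0}\,\delta P_{0,/1}}{p_{1,/0}\,p_{0,/1}}$$

The probable values of the terms on the right hand side of the last equation are
derived on pages 17-19 and listed on pages 32-33 of the Metron article referred
to. The proofs, which imply essentially a process of variation of Stieltjes
integrals, will not be given here. From pages 32-33 we take

$$[(\delta P_{1,/0})^2]^{0} = \frac{p_{20} - p_{1,/0}^2}{N}, \qquad [(\delta P_{0,/1})^2]^{0} = \frac{p_{02} - p_{0,/1}^2}{N}, \qquad [\delta P_{1,/0}\,\delta P_{0,/1}]^{0} = \frac{p_{11} - p_{1,/0}\,p_{0,/1}}{N}$$

so that

$$\frac{\sigma_{r_1}^2}{r_1^2} = \frac{1}{N}\left[\frac{p_{20}}{p_{1,/0}^2} + \frac{p_{02}}{p_{0,/1}^2} - \frac{2\,p_{11}}{p_{1,/0}\,p_{0,/1}}\right].$$

Assuming Gaussian distribution, we can put

$$p_{20} = \frac{\pi}{2}\,p_{/1,0}^2, \qquad p_{02} = \frac{\pi}{2}\,p_{0,/1}^2, \qquad p_{11} = r\sqrt{p_{20}\,p_{02}} = r\,\frac{\pi}{2}\,p_{/1,0}\,p_{0,/1}.$$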


* Felix Bernstein: “Die mittleren Fehlerquadrate und Korrelationen der Potenzmomente
und ihre Anwendung auf Funktionen der Potenzmomente,” Metron, Vol. X, N. 3
(Nov. 1932).
Hence

(15) $$\frac{\sigma_{r_1}^2}{r_1^2} = \frac{\pi}{2N}\left(1 + \frac{p_{/1,0}^2}{p_{1,/0}^2} - 2r\,\frac{p_{/1,0}}{p_{1,/0}}\right)$$

Replacing the theoretical values by their corresponding empirical values,
we have

(16) $$\sigma_{r_1}^2 = r_1^2\,\frac{\pi}{2N}\,(1 + m^2 - 2mr) \quad\text{where } m = \frac{S\,x\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,y}.$$

The formula for $\sigma_{r_1}$ has been derived here for the value of $r_1$ as given by (8),
i.e. $r_1 = \frac{S\,x\,\mathrm{sg}\,y}{S\,y\,\mathrm{sg}\,y}$. In fact, we used $r_1 = \frac{S\,x\,\mathrm{sg}(y - a)}{S\,y\,\mathrm{sg}(y - a)}$ in the examples in the
article, and a had some value absolutely smaller than .5. To use equation (16)
for the standard deviation of $r_1$ is within the limits of the required degree of
accuracy; hence we shall disregard the difference. In a later paper the standard
deviation of $r_1$ for any a will be derived by using the method described in the
Metron article, for a different purpose.

To prove the statement in the footnote to page 7.

To find the value of $r_2$ that makes

$$S\,f(x)\,(y - r_2 x)^2$$

a minimum.

By differentiating we get

$$S\,f(x)\,(y - r_2 x)\,x = 0$$

$$r_2 = \frac{S\,x\,f(x)\,y}{S\,x\,f(x)\,x}$$

If f(x) = 1 we get Pearson’s coefficient.

If $f(x) = \frac{1}{|x|}$ (x ≠ 0) we get

$$r_2 = \frac{S\,y\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,x}$$
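
The two special cases can be compared on a small example. The sketch below is illustrative only: the deviations are invented for the purpose, no x equals zero (the weight 1/|x| is undefined there), and the helper names `weighted_slope` and `sign` are not from the paper.

```python
def weighted_slope(xs, ys, weight):
    """Slope m minimizing sum(weight(x) * (y - m*x)**2): S x f(x) y / S x f(x) x."""
    num = sum(weight(x) * x * y for x, y in zip(xs, ys))
    den = sum(weight(x) * x * x for x in xs)
    return num / den


def sign(v):
    """sg v: +1, 0 or -1."""
    return (v > 0) - (v < 0)


# Hypothetical deviations from the mean (both lists sum to zero).
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [-2.0, -3.0, 1.0, 0.0, 1.0, 3.0]

# Weight f(x) = 1/|x| reproduces the partial-sum expression S y sg x / S x sg x.
by_weights = weighted_slope(xs, ys, lambda x: 1.0 / abs(x))
by_signs = sum(y * sign(x) for x, y in zip(xs, ys)) / sum(x * sign(x) for x in xs)

# Weight f(x) = 1 gives the ordinary least-squares (Pearson) coefficient.
pearson = weighted_slope(xs, ys, lambda x: 1.0)

print(by_weights, by_signs, pearson)
```

For these invented figures the sign form gives 8/12 and the unweighted form 22/28, so the two choices of f(x) lead to different slopes on the same data.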
METHODS OF OBTAINING PROBABILITY DISTRIBUTIONS¹

By Burton H. Camp


The emphasis of this paper will be on method. Special results will be cited 
in order to illustrate the methods rather than to summarize achievement in the 
field; for that has been done already by Rider (1930, 1935), Irwin (1935), and
Shewhart (1933) in recent surveys. The purpose is to describe and to illustrate 
most of the methods that have been used to determine exact probability dis- 
tributions, and to show that they are all derivable from one fundamental theorem. 
In order to prove this unity in a simple manner, it will be desirable to omit from 
consideration methods which are essentially ingenious forms of counting, such
as are used in sampling without replacements from finite universes, and in 
finding the sampling distribution of a percentile. 

The general problem to be discussed may be stated as follows: N individuals
($t_1, \cdots, t_N$) are drawn, one at a time with replacements, from a universe whose
probability distribution is $\phi(t)$. A certain single valued function of the t’s is
formed. This is called a parameter of the sample, and is frequently also,
but not necessarily, a useful estimate of the corresponding parameter of the
universe. The problem is to find its probability distribution, f(x). As usual,
a probability distribution is a function which is required to be defined, except
perhaps at a set of measure zero, throughout the infinite domain of its variables;
it is nowhere negative, and its integral over its domain is unity.

Most of the more recent developments of the theory relate to a more general
form of this problem. Instead of N individuals, there are N sets of n individuals
in each set, and these sets are drawn respectively from M (M ≤ N) universes,
each of which is described by a function of n independent variables, thus:

(1) $$\phi^{(j)}(t_1, \cdots, t_n); \quad (j = 1, \cdots, M).$$

Instead of a single parameter there are P parameters, and each is a single valued
function of the observed values of the nN individuals in the sample, thus:

(2) $$x_i = g_i\bigl(t_1^{(1)}, \cdots, t_n^{(1)}; \cdots; t_1^{(N)}, \cdots, t_n^{(N)}\bigr); \quad (i = 1, \cdots, P).$$

The first method to be described is fundamental and will be designated as
Theorem I. Let it be required that each g as described in (2) be not only
single valued but also constant at most in a set of measure zero in the nN-way
space of the t’s. Then

(I) $$\int_p f(x_1, \cdots, x_P)\,dX = \int_q \Phi\bigl(t_1^{(1)}, \cdots, t_n^{(N)}\bigr)\,dT,$$

¹ Presented to the American Mathematical Society at a meeting devoted to expository
papers on the theory of statistics, April 11, 1936.
where X is the space of x’s and T the space of the t’s, p is any measurable set
of points in X, and q is the set in T for which g is in p. Often p is the P dimensional
cube ($x_i$, $x_i + \Delta x_i$; i = 1, ..., P) at the point ($x_1, \cdots, x_P$) and then q is
the set where

(3) $$x_i \leq g_i \leq x_i + \Delta x_i; \quad (i = 1, \cdots, P)$$

and $\Phi$ is the simultaneous distribution of the sets of t’s,

(4) $$\Phi = \prod_{j=1}^{N} \phi^{(j)}\bigl(t_1^{(j)}, \cdots, t_n^{(j)}\bigr).$$

In this $\phi^{(j)}$ is the universe from which the j-th set of t’s is drawn. Obviously,
if N > M, some of the $\phi^{(j)}$’s are identical, and then it is assumed that the several
sets are drawn independently. Often, all of the N sets of t’s are drawn from
the same universe. Then M = 1 and all these $\phi$’s are identical, and (4) becomes

$$\Phi = \prod_{j=1}^{N} \phi\bigl(t_1^{(j)}, \cdots, t_n^{(j)}\bigr).$$
In the special case where there is but one parameter (P = 1) and but one
individual in the sample (n = N = 1), and p is an interval, formula (I) becomes

(Ia) $$\int_x^{x+\Delta x} f(x)\,dx = \int_q \phi\,dt;$$

and in the very special case where it is also true that q is an interval it becomes

(Ib) $$f(x) = \phi(t)\,\left|\frac{dt}{dx}\right|,$$

provided also that certain derivatives (to be specified later in the proof) exist,
where t is now the inverse solution of the equation,

(5) $$x = g(t).$$


The proof of formula (I) is immediate, if one is willing to assume the existence
of the probability distribution f; for then the left side is by definition the probability
that the x’s lie in p, and this is also the meaning of the right side of (I).
(Ia) can be proved without assuming initially the existence of f(x), for then
the existence of f(x) can be inferred from the existence of the right side of (Ia),
because f(x) may be set equal (except perhaps at a set of measure zero) to the
upper right hand derivative, with respect to Δx (Δx is a variable, and x is fixed),
of $\int_q \phi\,dt$, provided that one adds the condition that this derivative is nowhere
infinite. The point at issue here is merely the existence of a primitive for a
monotone increasing function of Δx. (Ib) may be derived from (Ia) by taking
the derivative of both sides with respect to Δx, if the derivatives are continuous.
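
Forms (Ia) and (Ib) can be compared numerically on a simple case. The sketch below is an illustration under assumptions chosen here, not taken from the paper: the universe is $\phi(t) = e^{-t}$ for t > 0, the parameter is $x = g(t) = t^2$, and the integrals are approximated by a midpoint rule.

```python
from math import exp, sqrt


def phi(t):
    """Assumed universe for illustration: phi(t) = e^{-t} for t > 0."""
    return exp(-t) if t > 0 else 0.0


def f(x):
    """Distribution of x = g(t) = t^2 by (Ib): f(x) = phi(t) |dt/dx| with t = sqrt(x)."""
    t = sqrt(x)
    return phi(t) / (2.0 * t)


def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


x_lo, x_hi = 1.0, 4.0                                     # the interval p in x
left = midpoint_integral(f, x_lo, x_hi)                   # left side of (Ia)
right = midpoint_integral(phi, sqrt(x_lo), sqrt(x_hi))    # right side: phi over q
exact = exp(-1.0) - exp(-2.0)                             # closed form of the right side

print(left, right, exact)
```

Here q is the interval from 1 to 2 in t, the image of p under the inverse of g, and the two sides agree to within the accuracy of the quadrature.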

Theorem I, in these various forms, is used a great deal, especially in the last
form (Ib). This affords one freedom to choose the most desirable function
for purposes of tabulation. R. A. Fisher’s z distribution, a logarithm, is an
important illustration. Many authors have been interested in so choosing the
function that its distribution shall be normal. They include several of the
older writers, and more recently H. L. Rietz (1921, 1927), and G. A. Baker
(1932, 1934). However, the theorem is of special importance in the theory,
for all the other principal methods of obtaining probability distributions are
essentially corollaries of it. These corollaries will be called Theorems II, III,
and IV.

Theorem II. Let $\bar{p}$ (the measure of p) and $\bar{q}$ (the measure of q) be infinitesimals
of the same order and let both the oscillation of f (i.e. maximum f minus
minimum f) in p and the oscillation of $\Phi$ in q be infinitesimals; then (I) may be
written,

(II) $$f\,\bar{p} = \Phi\,\bar{q},$$

where f applies to any point of p and $\Phi$ to the corresponding point of q. This
equation (II) is an approximate equation in the sense that differences of higher
order than those retained are neglected. In particular, with the conditions
used in formula (Ia), equation II becomes

$$f\,\Delta x = \phi\,\bar{q}.$$

The left side of (II) is an approximation to the probability sought. The right
side shows that, in order to evaluate it, one need only find the volume in T space
of the differential element q and multiply it by the value of $\Phi$ in q. Formula (II)
expresses the so-called geometrical method used by many authors, e.g., by
R. A. Fisher (1915, 1925), by Wishart (1928), and by Hotelling (1925, 1927).
The chief difficulty in connection with it is in finding the volume of nN-dimensional
q. In order to display the advantages and disadvantages of this method
we shall pause at this point and look at a concrete example.²

Let two individuals ($t_1$, $t_2$) be drawn independently from a normal universe
and consider the simultaneous distribution f(x, y) of the sum, $x = t_1 + t_2$,
and product, $y = t_1 t_2$, the mean of the universe being chosen as the origin.
Here N = 2, n = 1, M = 1, and so,

(6) $$\Phi = \frac{1}{2\pi\sigma^2}\,e^{-\frac{t_1^2 + t_2^2}{2\sigma^2}} = \frac{1}{2\pi\sigma^2}\,e^{-\frac{x^2 - 2y}{2\sigma^2}}.$$

The point set q is the area lying between the two adjacent hyperbolae,

$$t_1 t_2 = y, \qquad t_1 t_2 = y + \Delta y,$$

and also between the two adjacent lines,

$$t_1 + t_2 = x, \qquad t_1 + t_2 = x + \Delta x,$$

where Δx and Δy are infinitesimals and are equal. This area may be computed
by simple integration and is:


² See also C. C. Craig (1936). Craig uses another method to be explained later (formula
IIIa).
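$$\bar{q} = \frac{2\,\Delta x\,\Delta y}{\sqrt{x^2 - 4y}} \quad \text{if } x^2 > 4y,$$

$$\bar{q} = 0 \quad \text{if } x^2 < 4y.$$

Hence II gives us immediately the desired result:

$$f(x, y)\,\Delta x\,\Delta y = \frac{1}{\pi\sigma^2}\,e^{-\frac{x^2 - 2y}{2\sigma^2}}\,\frac{1}{\sqrt{x^2 - 4y}}\,\Delta x\,\Delta y \quad \text{if } x^2 > 4y,$$

$$f(x, y)\,\Delta x\,\Delta y = 0 \quad \text{if } x^2 < 4y.$$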

If $x^2 = 4y$, $\bar{q}$ is an infinitesimal of lower order than $\bar{p} = (\Delta x)^2$, and so Theorem II
does not apply. In this case we must go back to Theorem I, and from that we
can learn that the probability,

$$\int_p f\,dx\,dy,$$

is an infinitesimal of the first order if $\bar{p} = \Delta x\,\Delta y = (\Delta x)^2$ is of the second order.
Hence it cannot be approximately represented by a finite number times $\bar{p}$.
The oscillation of f in p is infinite. The form of the surface f(x, y) is interesting.
The ordinates rise to infinity on the contour of the parabola $x^2 = 4y$, and vanish
within it. The surface is symmetrical with respect to the plane x = 0, but
not with respect to the plane y = 0. However, it is clear that the total probability
of any given product, y (i.e. the probability of this y for all possible
values of x), is the same as the total probability of -y; hence



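$$\int_{-\infty}^{\infty} f(x, y)\,dx = \int_{-\infty}^{\infty} f(x, -y)\,dx,$$

and the corresponding formulae,

$$\frac{2}{\pi\sigma^2}\,e^{\frac{y}{\sigma^2}} \int_{2\sqrt{y}}^{\infty} \frac{e^{-\frac{x^2}{2\sigma^2}}}{\sqrt{x^2 - 4y}}\,dx \qquad (y > 0),$$

and

$$\frac{2}{\pi\sigma^2}\,e^{\frac{y}{\sigma^2}} \int_{0}^{\infty} \frac{e^{-\frac{x^2}{2\sigma^2}}}{\sqrt{x^2 - 4y}}\,dx \qquad (y < 0),$$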

must be equal; both may be reduced to the single form 




if y ≠ 0.


This is the probability distribution of y. 

With this example before us, let us now reconsider the theory:

(i) The requirement (in II) that the oscillation of $\Phi$ be infinitesimal in q will
be satisfied if one can show that $\Phi$ may be expressed as a continuous function
of the parameters ($x_1, x_2, \cdots, x_P$). In our example these parameters were
x and y and $\Phi$ was so expressible (6). But if we had tried initially to find by
means of (II) the distribution of the product y, independently of what values
x might have, we should have been stopped at this point, because $\Phi$ is not
expressible in terms of y alone. We should also have been stopped by the
requirement that $\bar{q}$ be infinitesimal of order Δy, for q would have been the
space between two hyperbolas and its area for any fixed Δy (Δy > 0) would have
been infinite. But, when thus stopped at that first point, it would have been
clearly indicated to us that the distribution of y might have been found via
the detour of finding the simultaneous distribution of both x and y, because
an attempt to express $\Phi$ in terms of y would have led to the given expression in
terms of both x and y. For a similar reason R. A. Fisher (1925) was able to
find the distribution of the variance by finding first the simultaneous distribution
of the variance and the mean. Also, he was thus able to find the distribution
of the coefficient of correlation by finding first the simultaneous distribution of
all the first and second order moments.

(ii) A distinct advantage of this method is that q is independent of the
universe so that once found it may be used in connection with any universe 
which satisfies the condition that it can be expressed as a continuous function 
of the parameters. Thus, the distribution of the sum and product in our 
example may equally well be found for the universe described by the Type III 
curve, > 0). For, then 

^ = A* <1 <* = A* y e"“, 

and so, using one-half of the same q as before, since now x, y ≥ 0,


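$f(x, y) = \dfrac{A^2\,y\,e^{-ax}}{\sqrt{x^2 - 4y}}$ if $x^2 > 4y$,

$f(x, y) = 0$ if $x^2 < 4y$.

From this, F(y) can be found by integration (cf. Kullback, 1934):

$F(y) = A^2 y \int_{2\sqrt{y}}^{\infty} \dfrac{e^{-ax}}{\sqrt{x^2 - 4y}}\,dx = \dfrac{A^2 y}{2} \int_0^{\infty} \dfrac{e^{-a(u + y/u)}}{u}\,du.$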


As another illustration, consider a normal universe of n intercorrelated variables
in which all the total intercorrelations are equal to r (e.g., the statures of
n brothers) and let the sample be a single group of n (one individual for each
variable).

$\phi = \dfrac{1}{(2\pi)^{n/2} R^{1/2}} \exp\left\{-\dfrac{1}{2R}\left[k_1 \sum_i t_i^2 + k_2 \sum_{i \ne j} t_i t_j\right]\right\},$

where $R = (1 - r)^{n-1}[1 + (n - 1)r]$, $k_1 = (1 - r)^{n-2}[1 + (n - 2)r]$, and
$k_2 = -r(1 - r)^{n-2}$. Suppose one wishes to find the simultaneous distribution
of the variance x and the mean y for such samples.* Since for Student's problem
Fisher has found the value of q for this x and y to be

$q = c\,x^{(n-3)/2}\,\Delta x\,\Delta y,$

their distribution f(x, y) for this universe may be written down immediately.
In terms of x and y the bracket in the exponent of $\phi$ is $y^2(k_1 n - k_2 n + k_2 n^2) + xn(k_1 - k_2)$,
and so f(x, y) is the product of q and this form of $\phi$:

$f(x, y) = K e^{\theta} x^{(n-3)/2}, \qquad \theta = -\dfrac{1}{2R}\left[(k_1 n - k_2 n + k_2 n^2)\,y^2 + n(k_1 - k_2)\,x\right].$
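
The change of variables in this illustration can be checked numerically. The sketch below assumes that the bracket in the exponent is the quadratic form $k_1 \sum t_i^2 + k_2 \sum_{i \ne j} t_i t_j$ and that the variance x is taken with divisor n; the function names are illustrative only.

```python
def cofactors(n, r):
    """k1 and k2 for n equally intercorrelated variables (correlation r)."""
    k1 = (1 - r) ** (n - 2) * (1 + (n - 2) * r)
    k2 = -r * (1 - r) ** (n - 2)
    return k1, k2


def quadratic_form(t, k1, k2):
    """k1 * sum t_i^2 + k2 * sum over i != j of t_i t_j (assumed bracket)."""
    s1 = sum(t)
    s2 = sum(v * v for v in t)
    return k1 * s2 + k2 * (s1 * s1 - s2)


def bracket_in_x_and_y(t, k1, k2):
    """The same bracket written in terms of the mean y and the variance x."""
    n = len(t)
    y = sum(t) / n
    x = sum((v - y) ** 2 for v in t) / n  # variance with divisor n (assumption)
    return y * y * (k1 * n - k2 * n + k2 * n * n) + x * n * (k1 - k2)


sample = [1.2, -0.7, 0.4, 2.5, -1.1]
k1, k2 = cofactors(len(sample), 0.3)
lhs = quadratic_form(sample, k1, k2)
rhs = bracket_in_x_and_y(sample, k1, k2)
print(lhs, rhs)
```

The two printed values agree up to rounding, which is the identity used to write the exponent in terms of x and y alone.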

(iii) Another attribute of this method is that it sometimes lends itself to easy
extensions from a simple case where there is only one restriction (N - 1 degrees
of freedom) to similar cases when there are more restrictions. Thus R. A.
Fisher (1924) proceeded from the variance of a sample from a single universe
to the variance from a set of universes, as required in the theory of analysis of
variance; and thus also (1915) he had proceeded from the distribution of r to
that of multiple R; and Hotelling (1927) showed how these distributions could
be obtained when the values of each variate were themselves intercorrelated
(as in a time series) and not merely correlated with values of the other variates.

Theorem III. Now let us consider again the fundamental form (I). For
convenience let nN = m. If the conditions will not permit us to write the right
side in the form in (II), it is still possible that we may be able to find that
(m + 1)-dimensional volume by some other method. In particular, whenever
it is possible to iterate the integral once we have the formula:

(III) $\int_p f\,dX = \int_{T'} dT' \int_{q_m} \phi\,dt_m,$

where $q_m$ is the section of q by $t_m$ space at the point $(t_1, \dots, t_{m-1})$ of T' space,
T' space being the space of the $(t_1, \dots, t_{m-1})$ coordinates. With added conditions
one may deduce from (III), for the case where there is but a single parameter x,
the approximate equation:

(IIIa) $f\,dx = dx \int_{T'} \phi\,\dfrac{\partial t_m}{\partial x}\,dT',$

in which $t_m$ is supposed to have been expressed in terms of the other coordinates
by solving the equation $x = x(t_1, \dots, t_m)$. It is an approximate equation in
the same sense as (II) was. Sufficient conditions for this change in the left
side of (III) have already been mentioned in discussing (II). The propriety
of making the corresponding change in the right hand side may be left for
determination when the form of $\phi$ is given. It will perhaps be sufficient here
to point out that our earlier example illustrates both the case where this change



• A special case of a more general problem solved first by R. A. Fisher. 
is permissible and where it is not. For, let it be required to find the distribution
f(y) of the product $y = t_1 t_2$ without reference to the sum, $t_1 + t_2$. Formula
(III) yields

(7) $f\,dy = 2\int_0^{\infty} dt_1 \int_{y/t_1}^{(y + \Delta y)/t_1} \phi\,dt_2.$

This is valid for every value of y including y = 0. If $y \ne 0$, we may change
the right hand side as in (IIIa) and obtain as the probability that y is in the
interval $(y, y + \Delta y)$:

(8) $f\,\Delta y = 2\,\Delta y \int_0^{\infty} \phi(t_1, y/t_1)\,\dfrac{dt_1}{t_1} + \epsilon,$

where $\epsilon$ is a differential of higher order than $\Delta y$. This may be proved by
computing the difference between the value of (7) when $t_2$ has constantly the value
$(y + \Delta y)/t_1$ and when it has constantly the value $y/t_1$. If y = 0 this change
in the right side of (7) is not valid; it is easily seen that in this case the integral
on the right of (8) is infinite. It may be shown, however, in this case that


sAtf 



and that this is an infinitesimal, and that it is of order as small as one.

Many authors think of (IIIa) as the fundamental formula in the theory of
probability distributions. One of the simplest and earliest applications of it
was to establish the so-called reproductive property of the normal law: that
the sum of two variates is distributed normally if each is distributed normally.
Jackson (1936) has used it to establish a similar property for two Type III
distributions which have the same exponent of e. Usually this integral is
difficult to evaluate when N > 2 because of the unsymmetrical form into
which it is cast, but when N = 2 and there is but one parameter, (IIIa) is
perhaps the most convenient of all the formulae.
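
The reproductive property of the normal law can be illustrated numerically. The sketch below assumes the case N = 2 with the sum $x = t_1 + t_2$ of two independent normal variates with zero means, so that the single integral reduces to $f(x) = \int \phi_1(t)\,\phi_2(x - t)\,dt$. The trapezoidal rule and the function names are choices made for this illustration, not part of the original argument.

```python
import math


def normal_pdf(t, sigma):
    """Density of a normal variate with mean 0 and standard deviation sigma."""
    return math.exp(-t * t / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def density_of_sum(x, sigma1, sigma2, half_width=12.0, steps=4000):
    """Trapezoidal approximation to the integral of phi1(t) * phi2(x - t) dt."""
    a, b = -half_width, half_width
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, sigma1) * normal_pdf(x - a, sigma2)
                   + normal_pdf(b, sigma1) * normal_pdf(x - b, sigma2))
    for k in range(1, steps):
        t = a + k * h
        total += normal_pdf(t, sigma1) * normal_pdf(x - t, sigma2)
    return total * h


# The sum should again be normal, with variance sigma1^2 + sigma2^2.
x = 0.8
approx = density_of_sum(x, 1.0, 2.0)
exact = normal_pdf(x, math.sqrt(1.0 ** 2 + 2.0 ** 2))
print(approx, exact)
```

The integration range is truncated at 12 standard deviations of the first variate, which is ample for the values used here but would need widening for very large sigma1.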

Theorem IV. An exceedingly useful formula is obtainable from (I) in the
following manner. Let $\theta(x_1, \dots, x_P;\ \alpha_1, \dots, \alpha_Q)$ be a finite single valued
function of the old parameters (x) and of some new parameters ($\alpha$). Subject
to general conditions to be stated we may write:

(IV) $\int_X \theta f\,dX = \int_T \theta'\,\phi\,dT,$

an identity with respect to each $\alpha$, where $\theta'$ is the result of substituting (2)
for the x's in $\theta$.

Since this theorem has not been proved in this general form, an outline of
the proof will be given. Sufficient conditions are:

(a) All the integrals involved shall exist.
(b) If p is limited (in the sense that it lies within a finite hypersphere), so
is q, and conversely.

Proof. Let $X_0$ be a limited p set and $T_0$ the corresponding q set such that
both (c) and (d) hold ($\epsilon > 0$):

(c) $\left|\int_X \theta f\,dX - \int_{X_0} \theta f\,dX\right| < \epsilon,$

(d) $\left|\int_T \phi\,\theta'\,dT - \int_{T_0} \phi\,\theta'\,dT\right| < \epsilon.$

It is easy to see that such an $X_0$ and a corresponding $T_0$ do exist, as follows:
Let $X_0'$ be a limited set for which (c) is true, and for which it will remain
true no matter what points are added to $X_0'$. Similarly, let $T_0'$ be a limited
set for which (d) is true and for which it will remain true, no matter what
points are added to $T_0'$. Presumably $X_0'$ and $T_0'$ do not correspond to each
other, but we may now let $X_0$ be the totality of all the points of $X_0'$ and of all
those points of X corresponding to $T_0'$, and let $T_0$ be the totality of all the
points of $T_0'$ and of all those points of T corresponding to $X_0'$. Then $X_0$ and
$T_0$ do correspond to each other and have the desired properties (c) and (d).
Now, since $\theta$ is finite, it is limited in $X_0$. Let

(e) $|\theta| < H$ in $X_0$.

Divide the interval (-H, H) into s equal subintervals of length h, thus defining
in $X_0$ according to Lebesgue the measurable sets
$p_i$ (i = 1, ..., s), and corresponding $q_i$ sets in $T_0$, such that $\theta$ lies in the
i-th subinterval throughout $p_i$ and $\theta'$ lies in the i-th subinterval throughout $q_i$.

Choose arbitrarily any point of $p_i$ and let $k_i$ be the corresponding value of $\theta$.
Then let

$\bar\theta = k_i$ in $p_i$ (i = 1, ..., s), and similarly let
$\bar\theta' = k_i$ in $q_i$ (i = 1, ..., s).

Then

$\int_{X_0} \bar\theta f\,dX = \sum_i k_i \int_{p_i} f\,dX,$

and

$\int_{T_0} \bar\theta'\,\phi\,dT = \sum_i k_i \int_{q_i} \phi\,dT.$

Since by (I)

$\int_{p_i} f\,dX = \int_{q_i} \phi\,dT,$

(g) $\int_{X_0} \bar\theta f\,dX = \int_{T_0} \bar\theta'\,\phi\,dT,$

and

$\left|\int_{X_0} (\theta - \bar\theta) f\,dX\right| \le \int_{X_0} |\theta - \bar\theta|\,f\,dX \le h \int_{X_0} f\,dX,$

$\left|\int_{T_0} (\theta' - \bar\theta')\,\phi\,dT\right| \le h \int_{T_0} \phi\,dT.$

So, as h approaches zero both sides of (g) approach limits and their limits are
equal:

$\int_{X_0} \theta f\,dX = \int_{T_0} \theta'\,\phi\,dT.$

Hence by (c) and (d) the integrals

$\int_X \theta f\,dX, \qquad \int_T \theta'\,\phi\,dT,$

differ at most by $2\epsilon$, and so, being independent of $\epsilon$ they do not differ at all.

In order to determine the form of f from (IV) one must first evaluate the
right side,

$\int_T \theta'\,\phi\,dT = \psi(\alpha_1, \dots, \alpha_Q);$

and then solve the integral equation,

(10) $\int_X \theta f\,dX = \psi(\alpha_1, \dots, \alpha_Q).$

It is the solution of this equation that usually presents the most difficulty.
Particular forms of $\theta$ that are being used are

(11) $\theta = e^{\alpha_1 x_1 + \cdots + \alpha_P x_P},$

in which case $\psi$ is said to be the "characteristic function" or "moment generating
function"; and

(12) $\theta = x_1^{\alpha_1} \cdots x_P^{\alpha_P},$

in which case $\psi$ is a "moment function" or "moment" of f. Other forms might
be used. For example, a very convenient method of demonstrating the correctness
of the usual formula for the simultaneous distribution of the correlation
(x), means (y, z), and variances (u, v), in samples from a normal bivariate
universe is by the use of

0 ^ +»* +y* +#*) + a2(ttw* + y*) 


This method of finding f is not a final determination of the probability function
desired until it has been shown that the solution is unique, a serious problem
in itself; it is one of those which Professor Shohat may consider.* There are
three methods of solving the integral equation (10):

(i) The first might be called guessing. Though unscientific, it is in fact
often effective. Especially is it available if the distribution has already been
surmised but not demonstrated. Thus, it was open to Student (1908) when
he correctly surmised the distribution of the variance. Similarly it was open
to Soper (1913) when he incorrectly surmised the distribution of r.

(ii) Papers by Romanovsky (1925) and Wilks (1932) have shown how the
problem of solving the integral equation may be shifted to the problem of
solving a partial differential equation, but this in turn may involve the solution
of another equally difficult integral equation in the process of determining the
arbitrary function.

(iii) If each $\alpha$ be replaced by an imaginary $i\beta$ and one uses a Fourier transform,
one arrives at a set of formulae which are most important. For the case
where there is but one x and one $\beta$, they may be written:

(13) $\psi(\beta) = \int_{-\infty}^{\infty} e^{i\beta x} f(x)\,dx,$

(14) $f(x) = \dfrac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\beta x}\,\psi(\beta)\,d\beta.$

Dodd (1925) has given an equivalent set of formulae involving only real variables.
It is easy to prove that both sets may be changed to the single formula,

(15) $f(x) = \dfrac{1}{\pi} \int_T \phi\,dT \int_0^{\infty} \cos \beta(x - g)\,d\beta.$

Kullback (1936) has established the validity of the formulae corresponding to
(13) and (14) for the general case of (P + Q) parameters. Wishart and Bartlett
(1933) used the general forms to find the distribution of the generalized product
moment in samples from an n-dimensional normal system.
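
The inversion step can be tried numerically in the simplest case. The sketch below assumes a density that is symmetric about zero, so that its characteristic function is real and even and the inversion reduces to the real integral $f(x) = \frac{1}{\pi}\int_0^{\infty} \cos(\beta x)\,\psi(\beta)\,d\beta$. The standard normal law, with $\psi(\beta) = e^{-\beta^2/2}$, is used as the test case; the truncation point and step count are arbitrary choices for this illustration.

```python
import math


def invert_real_cf(cf, x, upper=40.0, steps=20000):
    """Approximate (1/pi) * integral from 0 to upper of cos(beta*x) * cf(beta) d(beta).

    Valid as an inversion only when cf is real and even (symmetric density).
    """
    h = upper / steps
    total = 0.5 * (cf(0.0) + math.cos(upper * x) * cf(upper))
    for k in range(1, steps):
        beta = k * h
        total += math.cos(beta * x) * cf(beta)
    return total * h / math.pi


def normal_cf(beta):
    """Characteristic function of the standard normal law."""
    return math.exp(-beta * beta / 2)


x = 1.3
recovered = invert_real_cf(normal_cf, x)
exact = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
print(recovered, exact)
```

For a characteristic function that decays slowly the upper limit would have to be much larger, and for an unsymmetrical density the sine part of the transform is needed as well.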

When the solution of the integral equations of (IV) cannot be found, one 
has to put up with the semi-invariants or with the moments of /. Formulae 
(IV) and (11) yield the semi-invariants, (IV) and (12) the moments about the 
given origin, and from either of these one may obtain the moments about the 
mean point. These methods are old but they are still important. Time does 
not permit me to discuss them, because it would not be proper to close this 
paper without some reference to limit methods. 

Limit Methods. It is well known that the distribution of means of samples 
taken from almost* any universe approaches the normal law as a limit as N 
becomes infinite. This theorem is subject to great generalizations, as is indi- 
cated in papers of A. Liapounoff (1901), S. Bernstein (1926), Romanovsky 

^ In a later paper at the same symposium. 

* There are exceptions. E.g., means of samples taken from the universe $a/\pi(a^2 + t^2)$
have a distribution identical with the universe itself.
(1929, 1930) and C. C. Craig (1932). Subject to very general conditions it
has been shown that: If the characteristic function of one probability distribution
contains a parameter and approaches as a limit, uniformly in every
finite domain of its variables, the characteristic function of another probability
distribution; then the first distribution approaches as a limit the second distribution.
Hence S. Bernstein and Romanovsky have shown that: If the universe
is an n-way correlation solid of a certain very general type, then the n means

obtained by a selection of a sample of N sets of variates, $x_i = \frac{1}{N}(t_{i1} + \cdots + t_{iN})$,

(i = 1, ..., n), have a distribution which approaches as a limit a normal
correlation solid as N becomes infinite. A similar theorem has been established
also in the interesting case of Romanovsky's "belonging coefficients" which
include K. Pearson's coefficient of racial likeness. Also, by the method of
maximum likelihood, Hotelling (1930) has proved that under certain general 
conditions all optimum estimates of the parameters of a frequency distribution 
have a joint distribution approaching the normal as N becomes infinite. The 
validity of the method of maximum likelihood when used for this purpose has 
been established by J. L. Doob (1934). 

Finally, one may note an apparently new limit theorem of another type. 
Its general nature will be obvious from the following application: 

Let a sample of N be drawn from the universe,

$\phi = A e^{-t^{\lambda}}$ if t > 0,

$\phi = 0$ if $t \le 0$.

It is readily proved, by means of (IV), that the distribution f(x) of the parameter,

$x = (t_1^{\lambda} + \cdots + t_N^{\lambda})^{1/\lambda},$

is a curve of the form,

$f(x) = C x^{N-1} e^{-x^{\lambda}}$ where x > 0,

$f(x) = 0$ elsewhere.

Now let $\lambda$ become infinite. The universe approaches as a limit the rectangle:

$\phi = A$ where $0 \le t < 1$,

$\phi = 0$ elsewhere.

The parameter x approaches as a limit X, where X = maximum $t_i$. The
distribution f(x) approaches as a limit the new distribution,

$F(X) = N X^{N-1}$ where 0 < X < 1,

$F(X) = 0$ elsewhere.

Hence we have proved in a new way, what was already known: that the distribution
of the greatest variate obtained by sampling from a rectangular universe
is of the form F(X).
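
The passage to the limit can be watched numerically. The sketch below rests on an assumption about the universe, namely that each variate has a density proportional to $e^{-t^{\lambda}}$ for t > 0 and that the parameter is $x = (t_1^{\lambda} + \cdots + t_N^{\lambda})^{1/\lambda}$. Under that assumption $x^{\lambda}$ follows a Gamma law with index $N/\lambda$, which gives the normalized density coded below. The names `n_sample` and `lam` are hypothetical.

```python
import math


def density_power_norm(x, n_sample, lam):
    """Density of x = (sum of t_i^lam)^(1/lam), each t_i having density
    proportional to exp(-t^lam) on t > 0 (assumed universe)."""
    if x <= 0:
        return 0.0
    return lam * x ** (n_sample - 1) * math.exp(-x ** lam) / math.gamma(n_sample / lam)


def density_of_maximum(x, n_sample):
    """Density N * X^(N-1) of the greatest of N variates from the rectangle (0, 1)."""
    return n_sample * x ** (n_sample - 1) if 0 < x < 1 else 0.0


n_sample, point = 3, 0.5
target = density_of_maximum(point, n_sample)
errors = [abs(density_power_norm(point, n_sample, lam) - target)
          for lam in (10.0, 100.0, 1000.0)]
print(target, errors)
```

The errors shrink as the exponent grows, in keeping with the limit described above. The sketch evaluates only points inside (0, 1), since for x > 1 and very large exponents the power `x ** lam` can overflow.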

The limit theorem implicit in this illustration can be established in sufficient
generality, but I do not yet know whether it has other applications of value.
MOMENT RECURRENCE RELATIONS FOR BINOMIAL, POISSON
AND HYPERGEOMETRIC FREQUENCY DISTRIBUTIONS*

By John Riordan

1. Introduction. This paper gives the development of recurrence relations 
for moments about the origin and mean of binomial, Poisson, and hyper- 
geometric frequency distributions from the basis of the moment arrays defined 
by H. E. Soper.* This procedure has the advantage of expressing the moments 
in terms of coefficients which are alike for the three distributions and are de- 
rivable by a single process, thus providing a degree of formal coordination of 
the distributions. For both kinds of moments, the coefficients satisfy relatively 
simple recurrence relations, the use of which leads to recurrence relations for 
the moments, thus unifying the derivation of these relations for the three 
distributions. The relations derived in this way for the hypergeometric dis- 
tribution are apparently new. Apparently new recurrence relations for certain 
auxiliary coefficients in the expression of the moments about the mean of 
binomial and Poisson distributions are also given. 

This course of development involves repetition of a number of well-known 
results which is justified, it is hoped, by the unification obtained.* 


^ Presented to the American Mathematical Society, Sept. 3, 1936. 

* Frequency Arrays, Cambridge, 1922.
2. Moment Arrays. As developed by Soper, frequency distributions may be
exhibited by frequency arrays, in the case of a single variate, in the form:

(2.1) $f(A) = \sum_x p_x A^x,$

where $p_x$ are the frequencies with which the measures, x, of the character, A,
occur in a population.

The substitution $A = e^{\alpha}$ leads to the moment about the origin array:

$f(e^{\alpha}) = \sum_x p_x e^{x\alpha} = \sum_x p_x \left(1 + x\alpha + \dfrac{x^2\alpha^2}{2!} + \cdots\right) = \sum_{s=0}^{\infty} \dfrac{m_s\,\alpha^s}{s!},$

where

$m_s = \sum_x p_x\,x^s.$

The symbol $\alpha$ is a logical or umbral symbol serving merely to identify the
moments in the expansion of the array.

The moment array for moments about the mean is found from the relation:

$e^{-m_1\alpha} f(e^{\alpha}) = \sum_{s=0}^{\infty} \dfrac{\mu_s\,\alpha^s}{s!},$

where $m_1$ is the first moment about the origin.

The moment arrays for the distributions concerned are as follows:

Binomial: $f(e^{\alpha}) = [1 + p(e^{\alpha} - 1)]^n = \sum_{x=0}^{n} \binom{n}{x} p^x (e^{\alpha} - 1)^x$

Poisson: $f(e^{\alpha}) = e^{a(e^{\alpha} - 1)} = \sum_{x=0}^{\infty} \dfrac{a^x (e^{\alpha} - 1)^x}{x!}$

Hypergeometric: $f(e^{\alpha}) = \sum_{x=0}^{\infty} \dfrac{(l)_x (r)_x}{(n)_x} \dfrac{(e^{\alpha} - 1)^x}{x!}$

where the parameters p, n, and a for the binomial and Poisson have the usual
significance. The parameters for the hypergeometric distribution, with the
substitution r = s, follow Soper; Pearson (loc. cit.) uses q, r, n, where q = l/n.
The notation $(l)_x$ means

$(l)_x = l(l - 1) \cdots (l - x + 1).$

It will be seen that, with the usual interpretation of $\binom{n}{x}$ as zero for x > n,
the three distributions so far as concerns $\alpha$ may be exhibited by a function
of the form

$f(e^{\alpha}) = \sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x,$

where of course $A_x$ depends on the distribution concerned.


3. Moments About the Origin. The moments about the origin can then be
defined by the equation:

(3.1) $\sum_{s=0}^{\infty} \dfrac{m_s\,\alpha^s}{s!} = \sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x,$

and

$\sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x = \sum_{x=0}^{\infty} A_x \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} e^{v\alpha} = \sum_{s=0}^{\infty} \dfrac{\alpha^s}{s!} \sum_{x=0}^{s} x!\,A_x\,S_{x,s},$

where $S_{x,s}$ is a Stirling number of the second kind, as used by Jordan (loc. cit.)
and defined by

$x!\,S_{x,s} = \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} v^s = \Delta^x 0^s,$

being in the language of the finite difference calculus, a "difference of
nothing," that is $\Delta^x n^s \,|_{n=0}$.

The internal series terminates at s because $S_{x,s} = 0$, x > s, as is readily
apparent in the finite difference expression. Further $S_{0,s} = 0$, $s \ne 0$; $S_{0,0} = 1$.
By equating coefficients in equation (3.1), $m_s$, the sth moment about the
origin, is given by

(3.2) $m_s = \sum_{x=0}^{s} x!\,A_x\,S_{x,s}.$

The particular forms for the three distributions are as follows:

(3.3) $m_s = \sum_{x=0}^{s} (n)_x\,p^x\,S_{x,s}$   Binomial

(3.4) $m_s = \sum_{x=0}^{s} a^x\,S_{x,s}$   Poisson

(3.5) $m_s = \sum_{x=0}^{s} \dfrac{(l)_x (r)_x}{(n)_x}\,S_{x,s}$   Hypergeometric
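
Formulas (3.3) and (3.5) can be checked against moments computed directly from the frequencies. The sketch below uses exact rational arithmetic. It assumes the usual reading of the hypergeometric parameters (a population of n containing l marked individuals, from which r are drawn), which is an interpretation adopted for this illustration; the function names are likewise illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2(x, s):
    """S_{x,s} from the definition x! S_{x,s} = sum_v (-1)^(x-v) C(x,v) v^s."""
    total = sum((-1) ** (x - v) * comb(x, v) * v ** s for v in range(x + 1))
    return total // factorial(x)


def falling(z, x):
    """Falling factorial (z)_x = z (z-1) ... (z-x+1)."""
    out = 1
    for j in range(x):
        out *= z - j
    return out


def binomial_moment(n, p, s):
    """m_s for the binomial distribution by formula (3.3)."""
    return sum(falling(n, x) * p ** x * stirling2(x, s) for x in range(s + 1))


def hypergeometric_moment(l, r, n, s):
    """m_s for the hypergeometric distribution by formula (3.5).

    Terms with x > min(l, r) vanish, so the sum stops there.
    """
    top = min(s, l, r)
    return sum(Fraction(falling(l, x) * falling(r, x), falling(n, x)) * stirling2(x, s)
               for x in range(top + 1))


def direct_binomial_moment(n, p, s):
    """m_s summed directly over the binomial frequencies."""
    q = 1 - p
    return sum(comb(n, x) * p ** x * q ** (n - x) * x ** s for x in range(n + 1))


def direct_hypergeometric_moment(l, r, n, s):
    """m_s summed directly over the hypergeometric frequencies."""
    return sum(Fraction(comb(l, x) * comb(n - l, r - x), comb(n, r)) * x ** s
               for x in range(min(l, r) + 1))


p = Fraction(1, 3)
for s in range(6):
    print(s, binomial_moment(6, p, s), hypergeometric_moment(5, 4, 12, s))
```

Because the arithmetic is exact, agreement with the direct sums is an equality rather than an approximation.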

The Stirling numbers have the following recurrence relation (Jordan loc.
cit.):

(3.6) $S_{x,s+1} = x\,S_{x,s} + S_{x-1,s}.$




This relation, in conjunction with equations (3.3)-(3.5), leads to moment recurrence
relations. The procedure is illustrated for the binomial distribution as
follows:

$m_{s+1} = \sum_{x=0}^{s+1} (n)_x\,p^x\,S_{x,s+1}$

$= \sum_{x=0}^{s+1} (n)_x\,p^x\,(x\,S_{x,s} + S_{x-1,s})$

$= p\,D_p m_s + (np\,m_s - p^2 D_p m_s)$

$= np\,m_s + pq\,D_p m_s,$

where q = 1 - p.

The steps in the process are expanded as follows:

$\sum_{x=0}^{s+1} (n)_x\,p^x\,x\,S_{x,s} = \sum_{x=0}^{s} (n)_x\,p^x\,x\,S_{x,s} = \sum_{x=0}^{s} (n)_x\,S_{x,s}\,p\,D_p(p^x) = p\,D_p m_s,$

$\sum_{x=0}^{s+1} (n)_x\,p^x\,S_{x-1,s} = \sum_{x=0}^{s+1} (n - x + 1)(n)_{x-1}\,p^x\,S_{x-1,s}$

$= n \sum_{x=0}^{s} (n)_x\,p^{x+1}\,S_{x,s} - \sum_{x=0}^{s} x\,(n)_x\,p^{x+1}\,S_{x,s}$

$= np\,m_s - p^2 D_p m_s.$

The results for the three distributions are as follows:

(3.7) $m_{s+1} = np\,m_s + pq\,D_p m_s$   Binomial

(3.8) $m_{s+1} = a\,m_s + a\,D_a m_s$   Poisson

(3.9) $m_{s+1} = \dfrac{lr}{n}\,m_s(l - 1, r - 1, n - 1) - (n + 1)\,\Delta_n m_s$   Hypergeometric

Here $D_p$ and $D_a$ denote differentiation with respect to p and a, respectively,
and $\Delta_n$ denotes the difference operation with respect to n. For the hypergeometric
distribution the moments are functions of l, r, and n as well as of s;
$m_s(l - 1, r - 1, n - 1)$ is the same function of l - 1, r - 1 and n - 1 as
$m_s(l, r, n)$ is of l, r, n. Equation (3.9) appears to be new.
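
Relations (3.7) and (3.9) lend themselves to a mechanical check. In the sketch below a binomial moment is held as a list of coefficients in powers of p, so that $D_p$ becomes an operation on the list. The hypergeometric relation is tested with exact fractions, taking $\Delta_n m_s = m_s(l, r, n + 1) - m_s(l, r, n)$. The helper names are illustrative, and the moments themselves are taken from formulas (3.3) and (3.5).

```python
from fractions import Fraction
from math import comb, factorial


def stirling2(x, s):
    """S_{x,s} from x! S_{x,s} = sum_v (-1)^(x-v) C(x,v) v^s."""
    total = sum((-1) ** (x - v) * comb(x, v) * v ** s for v in range(x + 1))
    return total // factorial(x)


def falling(z, x):
    """Falling factorial (z)_x."""
    out = 1
    for j in range(x):
        out *= z - j
    return out


def binomial_moment_poly(n, s):
    """Coefficients of m_s in powers of p (index = power), from (3.3)."""
    return [falling(n, x) * stirling2(x, s) for x in range(s + 1)]


def next_binomial_moment_poly(n, m):
    """Apply (3.7): m_{s+1} = n p m_s + p (1 - p) D_p m_s to a coefficient list."""
    deriv = [k * m[k] for k in range(1, len(m))]  # D_p m_s
    out = [0] * (len(m) + 1)
    for k, c in enumerate(m):
        out[k + 1] += n * c                       # n p m_s
    for k, c in enumerate(deriv):
        out[k + 1] += c                           # p D_p m_s
        out[k + 2] -= c                           # -p^2 D_p m_s
    return out


def hypergeometric_moment(l, r, n, s):
    """m_s(l, r, n) from (3.5); terms with x > min(l, r) vanish."""
    top = min(s, l, r)
    return sum(Fraction(falling(l, x) * falling(r, x), falling(n, x)) * stirling2(x, s)
               for x in range(top + 1))


def next_hypergeometric_moment(l, r, n, s):
    """Right side of (3.9)."""
    shifted = hypergeometric_moment(l - 1, r - 1, n - 1, s)
    diff_n = hypergeometric_moment(l, r, n + 1, s) - hypergeometric_moment(l, r, n, s)
    return Fraction(l * r, n) * shifted - (n + 1) * diff_n


print(binomial_moment_poly(4, 3))                                   # m_3 for n = 4
print(next_binomial_moment_poly(4, binomial_moment_poly(4, 2)))     # same, via (3.7)
print(hypergeometric_moment(5, 4, 12, 3), next_hypergeometric_moment(5, 4, 12, 2))
```

Holding the moment as a coefficient list keeps the check free of any symbolic-algebra dependency; a computer algebra system would serve equally well.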
For convenience of reference, a short table of the Stirling numbers of the
second kind follows:

                 S_{x,s}

  s \ x     0     1     2     3     4     5

    0       1
    1       0     1
    2       0     1     1
    3       0     1     3     1
    4       0     1     7     6     1
    5       0     1    15    25    10     1
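
The table can be regenerated from the recurrence (3.6) alone, starting from $S_{0,0} = 1$. The sketch below builds one row per value of s; the function name and the row layout are choices made for this illustration.

```python
def stirling_table(max_s):
    """rows[s][x] = S_{x,s}, built from S_{x,s+1} = x S_{x,s} + S_{x-1,s}."""
    rows = [[1]]                          # S_{0,0} = 1
    for s in range(max_s):
        prev = rows[-1] + [0]             # S_{s+1,s} = 0
        row = [x * prev[x] + (prev[x - 1] if x > 0 else 0) for x in range(s + 2)]
        rows.append(row)
    return rows


table = stirling_table(5)
for s, row in enumerate(table):
    print(s, row)
```

Each row s lists $S_{0,s}, S_{1,s}, \dots, S_{s,s}$, matching the rows of the table above.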


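4. Moments About the Mean. As shown in Section 2 above, moments
about the mean may be defined as follows:

(4.1) $\sum_{s=0}^{\infty} \dfrac{\mu_s\,\alpha^s}{s!} = e^{-m_1\alpha} \sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x,$

where $m_1$ is the first moment about the origin:

$m_1 = np$   Binomial

$m_1 = a$   Poisson

$m_1 = lr/n$   Hypergeometric

Now

$\sum_{x=0}^{\infty} A_x\,e^{-m_1\alpha} (e^{\alpha} - 1)^x = \sum_{x=0}^{\infty} A_x \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} e^{(v - m_1)\alpha} = \sum_{s=0}^{\infty} \dfrac{\alpha^s}{s!} \sum_{x=0}^{s} x!\,A_x\,\sigma_{x,s},$

where

$x!\,\sigma_{x,s} = \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} (v - m_1)^s = \Delta^x (-m_1)^s.$

It will be observed that for $m_1 = 0$, $\sigma_{x,s} = S_{x,s}$. The internal series terminates
at s for the same reason as before.

The moments about the mean are then given by:

(4.2) $\mu_s = \sum_{x=0}^{s} x!\,A_x\,\sigma_{x,s}.$

The particular forms for the three distributions are as follows:

(4.3) $\mu_s = \sum_{x=0}^{s} (n)_x\,p^x\,\sigma_{x,s}$   Binomial

(4.4) $\mu_s = \sum_{x=0}^{s} a^x\,\sigma_{x,s}$   Poisson

(4.5) $\mu_s = \sum_{x=0}^{s} \dfrac{(l)_x (r)_x}{(n)_x}\,\sigma_{x,s}$   Hypergeometric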
The coefficients $\sigma_{x,s}$ satisfy the following recurrence relation:*

(4.6) $\sigma_{x,s+1} = (x - m_1)\,\sigma_{x,s} + \sigma_{x-1,s},$

which in conjunction with equations (4.3)-(4.5) leads to moment recurrence
relations as before. The actual derivation is somewhat complicated by the
circumstance that $\sigma_{x,s}$ is a function of $m_1$ and therefore of the frequency parameters,
rather than a constant as before. The derivation is illustrated for the
binomial distribution as follows:

$\mu_{s+1} = \sum_{x=0}^{s+1} (n)_x\,p^x\,\sigma_{x,s+1}$

$= \sum_{x=0}^{s+1} (n)_x\,p^x\,[(x - np)\,\sigma_{x,s} + \sigma_{x-1,s}]$

$= \sum_{x=0}^{s} (n)_x\,\sigma_{x,s}\,p\,D_p(p^x) - np\,\mu_s + \sum_{x=0}^{s+1} (n)_x\,p^x\,\sigma_{x-1,s}$

$= p\,D_p \mu_s + nsp\,\mu_{s-1} - np\,\mu_s + np\,\mu_s - p^2 [D_p \mu_s + ns\,\mu_{s-1}]$

$= pq\,[ns\,\mu_{s-1} + D_p \mu_s].$

The steps in the process are expanded as follows:

$\sum_{x=0}^{s} (n)_x\,\sigma_{x,s}\,p\,D_p(p^x) = \sum_{x=0}^{s} (n)_x\,[p\,D_p(p^x \sigma_{x,s}) - p^{x+1} D_p \sigma_{x,s}]$

$= p\,D_p \mu_s - p \sum_{x=0}^{s} (n)_x\,p^x\,(-ns\,\sigma_{x,s-1})$

$= p\,D_p \mu_s + nsp\,\mu_{s-1},$

$\sum_{x=0}^{s+1} (n)_x\,p^x\,\sigma_{x-1,s} = \sum_{x=0}^{s+1} (n - x + 1)(n)_{x-1}\,p^x\,\sigma_{x-1,s}$

$= n \sum_{x=0}^{s} (n)_x\,p^{x+1}\,\sigma_{x,s} - \sum_{x=0}^{s} x\,(n)_x\,p^{x+1}\,\sigma_{x,s}$

$= np\,\mu_s - p^2 [D_p \mu_s + ns\,\mu_{s-1}].$

The relation $D_p \sigma_{x,s} = -ns\,\sigma_{x,s-1}$ is obtained from the definition equation of
$\sigma_{x,s}$ (with $m_1 = np$).

The resulting recurrence relations for the three distributions are as follows:

(4.7) $\mu_{s+1} = nspq\,\mu_{s-1} + pq\,D_p \mu_s$   Binomial

(4.8) $\mu_{s+1} = as\,\mu_{s-1} + a\,D_a \mu_s$   Poisson


* Jordan, loc. cit. or E. C. Molina, An Expansion for Laplacian Integrals . . , , Bell 
System Technical Journal, 11, p. 571. 
(4.9)


where 




(»+ 1) , r, n + 1) J 

n [''* ■ S - l,r - l,n - 1)] 


Hypergeometric 


Ir Ir 

n(n +1) " « 

(l-Dir- 

t» - 1) 


Ir 

n 


The last of these, which appears to be new, seems to be of formal interest only. 
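
Relation (4.8) for the Poisson distribution is easily run by machine. The sketch below holds each $\mu_s$ as a list of coefficients in powers of a and starts from $\mu_0 = 1$, $\mu_1 = 0$. The list representation and the function names are choices made for this illustration.

```python
def poisson_central_moments(max_s):
    """mu[s] = coefficients of mu_s in powers of a, from
    mu_{s+1} = a s mu_{s-1} + a D_a mu_s, with mu_0 = 1 and mu_1 = 0."""
    mu = [[1], [0]]
    for s in range(1, max_s):
        prev, cur = mu[s - 1], mu[s]
        out = [0] * (max(len(prev), len(cur)) + 1)
        for k, c in enumerate(prev):
            out[k + 1] += s * c            # a * s * mu_{s-1}
        for k in range(1, len(cur)):
            out[k] += k * cur[k]           # a * D_a mu_s
        mu.append(out)
    return mu


def trim(coeffs):
    """Drop trailing zero coefficients."""
    coeffs = list(coeffs)
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


moments = [trim(m) for m in poisson_central_moments(6)]
for s, m in enumerate(moments):
    print(s, m)
```

The output gives, for example, $\mu_4 = a + 3a^2$ and $\mu_6 = a + 25a^2 + 15a^3$.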
The coefficients $\sigma_{x,s}$ are related to the Stirling numbers by the expression:

$\sigma_{x,s} = \sum_{j=0}^{s} (-1)^j \binom{s}{j} S_{x,s-j}\,m_1^j = \sum_{j=0}^{s} a_j\,m_1^j,$

and consequently can be exhibited with detached coefficients in the form
$a_0 + a_1 + a_2 + \cdots + a_s$. For the binomial and Poisson distributions
certain simplifications, to be developed in the section following, in equations
(4.3) and (4.4) may be made. For the hypergeometric distribution it appears
necessary to use equation (4.5); the following short table of $\sigma_{x,s}$, employing the
detached coefficients mentioned above, is given for this purpose:

  s \ x   0             1             2             3           4       5

    1     0-1           1
    2     0+0+1         1-2           1
    3     0+0+0-1       1-3+3         3-3           1
    4     0+0+0+0+1     1-4+6-4       7-12+6        6-4         1
    5     0+0+0+0+0-1   1-5+10-10+5   15-35+30-10   25-30+10    10-5    1
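
The detached coefficients of the table follow directly from the Stirling numbers. The sketch below computes $a_j = (-1)^j \binom{s}{j} S_{x,s-j}$ for a given x and s; trailing zero coefficients, which the table omits, are retained in the returned list. The function names are illustrative.

```python
from math import comb, factorial


def stirling2(x, s):
    """S_{x,s} from x! S_{x,s} = sum_v (-1)^(x-v) C(x,v) v^s."""
    total = sum((-1) ** (x - v) * comb(x, v) * v ** s for v in range(x + 1))
    return total // factorial(x)


def sigma_detached(x, s):
    """Coefficients a_0, ..., a_s of sigma_{x,s} in powers of m_1."""
    return [(-1) ** j * comb(s, j) * stirling2(x, s - j) for j in range(s + 1)]


for s in range(1, 6):
    print(s, [sigma_detached(x, s) for x in range(s + 1)])
```

For instance, `sigma_detached(2, 4)` returns `[7, -12, 6, 0, 0]`, the entry shown as 7-12+6 in the table.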


5. Binomial and Poisson Moments About the Mean: Simplified Formulas.
5.1 Binomial. From examination of the first few moments about the mean,
it appears expedient* to write the formulas:

(5.1.1) $\mu_{2s} = \sum_{x=1}^{s} a_{x,2s}\,(npq)^x,$

$\mu_{2s+1} = (q - p) \sum_{x=1}^{s} a_{x,2s+1}\,(npq)^x.$

* The kind of expression chosen admits of some variety. A recurrence relation for
coefficients in the expansion $\mu_s = \sum_x a_{x,s}\,p^x$ has been given by E. H. Larguier, On a Method
For Evaluating the Moments of a Bernoulli Distribution, Bull. Am. Math. Soc., 42, 1, p. 24
(Abstract 8); I am indebted to Mr. Larguier for the opportunity of examining his results
in advance of publication.
When these are substituted into the moment recurrence relation, the coefficients
are found to be related as follows:

$a_{x,2s} = [x + pq\,D_{pq}]\,a_{x,2s-1} + (2s - 1)\,a_{x-1,2s-2} - 2pq\,[1 + 2x + 2pq\,D_{pq}]\,a_{x,2s-1},$

$a_{x,2s+1} = [x + pq\,D_{pq}]\,a_{x,2s} + 2s\,a_{x-1,2s-1},$

or, in general,

(5.1.2) $a_{x,s+1} = [x + pq\,D_{pq}]\,a_{x,s} + s\,a_{x-1,s-1} - pq\,[1 - (-1)^s]\,[1 + 2x + 2pq\,D_{pq}]\,a_{x,s}.$


Using detached coefficients of powers of pq as outlined above, these coefficients
may be exhibited as follows:

                                   a_{x,s}

  s \ x   1                        2                    3              4

    2     1
    3     1
    4     1 - 6                    3
    5     1 - 12                   10
    6     1 - 30 + 120             25 - 130             15
    7     1 - 60 + 360             56 - 462             105
    8     1 - 126 + 1680 - 5040    119 - 2156 + 7308    490 - 2380     105
    9     1 - 252 + 5040 - 20160   246 - 6948 + 32112   1918 - 13216   1260


It may be noted that the coefficients of the first column in conjunction with
equations (5.1.1) give the binomial seminvariants.

Equations (5.1.1) make the coefficients functions of pq only; a slight alteration
makes the coefficients functions of n only. Thus:

(5.1.3) $\mu_{2s} = \sum_{x=1}^{s} \beta_{x,2s}\,(pq)^x,$

$\mu_{2s+1} = (q - p) \sum_{x=1}^{s} \beta_{x,2s+1}\,(pq)^x,$

and the coefficients are found to satisfy the recurrence relation:

(5.1.4) $\beta_{x,s+1} = x\,\beta_{x,s} + ns\,\beta_{x-1,s-1} - [1 - (-1)^s](2x - 1)\,\beta_{x-1,s}.$

These coefficients may be exhibited by a rearrangement of the table given
above as may be seen by comparing equations (5.1.1) and (5.1.3). The first
few coefficients, with detached coefficients of powers of n, are as follows:

              beta_{x,s}

  s \ x   1     2           3

    2     1
    3     1
    4     1     -6 + 3
    5     1     -12 + 10
    6     1     -30 + 25    120 - 130 + 15

5.2 Poisson. The Poisson moments about the mean may be expressed as
follows:

(5.2.1) $\mu_s = \sum_{x=1}^{[s/2]} \alpha_{x,s}\,a^x,$

where [ ] represents "integral part of" and

(5.2.2) $\alpha_{x,s+1} = x\,\alpha_{x,s} + s\,\alpha_{x-1,s-1}.$

The coefficients $\alpha_{x,s}$ are the constant terms in the expressions for the corresponding
binomial distribution coefficients in powers of pq.


Bell Telephone Laboratories. 



NOTE ON ZOCH'S PAPER ON THE POSTULATE OF THE
ARITHMETIC MEAN

By Albert Wertheimer

1. Introduction. There appeared recently a paper by Richmond T. Zoch¹
entitled "On The Postulate of the Arithmetic Mean." The stated purpose of
his paper was to show that the derivation of the Postulate as given by Whittaker
& Robinson is not correct. It is the purpose of this paper to show
that Zoch has not proven any error to exist in the Whittaker & Robinson derivation,
but that there are a few errors in his paper. As this paper is intended
to be read with Zoch's paper as a reference, the terms used there will not be
redefined here, and except where otherwise stated, the symbols used will have
the same meaning.

2. Zoch introduces the function

$f = \bar{x} + a\,\mu_3/\mu_2$

and claims that it satisfies all the four axioms of Whittaker & Robinson, and
obviously it is not the arithmetic mean. He therefore concludes that their
derivation must have errors somewhere, and proceeds to find them. Let us
first examine the f function. Considering only the part $\mu_3/\mu_2$, the partial
derivatives with respect to $x_i$ are given by

$\dfrac{3\mu_2\{(x_i - \bar{x})^2 - \mu_2\} - 2\mu_3 (x_i - \bar{x})}{n\mu_2^2}.$

It is then stated (p. 172) ". . . clearly these partial derivatives are single valued
and continuous. Therefore the function $\mu_3/\mu_2$ satisfies axiom IV." Now,
the condition that a function be continuous and single valued means of course
that this be true throughout the region of definition of the function. It is not
shown how these derivatives are clearly continuous and single valued for the
very important case where all the x's are equal and the derivatives become
indeterminate. As a matter of fact they are not continuous in this case, and
therefore the f function does not satisfy axiom IV. To prove this, we only
have to consider the very simple case where we let

$x_i = k + c_i z,$

¹ This Journal, Vol. VI, no. 4, Dec. 1935, pp. 171-182.
where k is a fixed constant, $c_i$ is a set of arbitrary constants not all equal, and
z is a parameter. We then have

$\bar{x} = z\bar{c} + k,$

$\mu_2 = z^2 M_2,$

$\mu_3 = z^3 M_3,$

where

$\bar{c} = \frac{1}{n} \sum c_i,$

$M_2 = \frac{1}{n} \sum (c_i - \bar{c})^2,$

$M_3 = \frac{1}{n} \sum (c_i - \bar{c})^3.$

Substituting these values in f and the derivatives, we get, taking a = 1,

$f = k + z\bar{c} + z M_3/M_2,$

$\partial f/\partial x_i = \dfrac{1}{n} + \dfrac{3M_2\{(c_i - \bar{c})^2 - M_2\} - 2M_3 (c_i - \bar{c})}{n M_2^2}.$

Now going to the limit when z approaches zero, and all the x's approach k,
we get

$\lim_{z \to 0} f = k,$

$\lim_{z \to 0} \partial f/\partial x_i = \dfrac{1}{n}\left\{-2 + \dfrac{3(c_i - \bar{c})^2}{M_2} - \dfrac{2M_3 (c_i - \bar{c})}{M_2^2}\right\}.$

Thus, when all the x's approach the same value, the function f also approaches
the same value independent of the c's, that is regardless of the mode of approach,
while the derivatives can take on any value depending on the c's, that is on
how the limiting value of f is approached. The f function then does not have
continuous single valued partial derivatives, and therefore does not satisfy
axiom IV.
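
The dependence of the limiting derivative on the mode of approach can be seen in a small computation. The sketch below takes a = 1, approaches the point where all the x's equal k along two different sets of constants c, and compares the partial derivative with respect to the first variable. The particular constants and the function names are chosen only for this illustration.

```python
def central_moments(values):
    """Mean, second and third central moments (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return mean, m2, m3


def zoch_f(x, a=1.0):
    """f = mean + a * mu_3 / mu_2."""
    mean, mu2, mu3 = central_moments(x)
    return mean + a * mu3 / mu2


def partial_derivative(x, i, a=1.0):
    """Analytic partial derivative of f with respect to x_i."""
    n = len(x)
    mean, mu2, mu3 = central_moments(x)
    d = x[i] - mean
    return 1.0 / n + a * (3 * mu2 * (d * d - mu2) - 2 * mu3 * d) / (n * mu2 ** 2)


def limiting_derivative(c, i):
    """Limit of the partial derivative as z -> 0 along x_i = k + c_i z (a = 1)."""
    n = len(c)
    cbar, M2, M3 = central_moments(c)
    d = c[i] - cbar
    return (-2 + 3 * d * d / M2 - 2 * M3 * d / M2 ** 2) / n


k, z = 5.0, 1e-3
c_first, c_second = [0.0, 1.0, 3.0], [0.0, 2.0, 3.0]
x_first = [k + ci * z for ci in c_first]
x_second = [k + ci * z for ci in c_second]
print(zoch_f(x_first), zoch_f(x_second))
print(partial_derivative(x_first, 0), partial_derivative(x_second, 0))
```

Both values of f lie close to k, while the two derivatives differ by an amount that does not shrink with z.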

In part 2 of the paper it is stated "Now when the $x_i$ all approach a then both
f and $\partial f/\partial x_i$ become indeterminate forms. However, in this case f takes an
indeterminate form which can be evaluated and it can be shown that $\mu_3/\mu_2$
will always have the value zero, i.e., f will have the value a when all the $x_i \to a$;
while the $\partial f/\partial x_i$ can take any value whatever and in general the $\partial f/\partial x_i$ will
not be equal when the $x_i \to a$." This statement really amounts to saying that
the f function does not satisfy axiom IV, but it is there used to demonstrate
that one of Schiaparelli's propositions is false.

3. Having exhibited a function different from the arithmetic mean, and supposedly
satisfying all the four axioms, the question is asked "Where is the proof
given by Whittaker & Robinson lacking in rigor?" After numbering the
various steps in the derivation ". . . for the sake of rigor and careful reasoning
. . ." it is stated (p. 174), "The sixth step involves the tacit assumption that
the partial derivatives are functions of k. These partial derivatives are not
necessarily functions of k . . ." and it is therefore concluded that the sixth
step is not valid. Now, how can any function that by definition is to be evaluated
at $kx_i$ not be a function of k? What is shown (pp. 174-6) is that
these derivatives do not necessarily involve k explicitly, but this is neither
implied nor necessary for the sixth step, and there is no ground for doubting
its validity.

4. In order to overcome the supposed defect in the sixth step, it is proposed
to change axiom IV so as to require the partial derivatives to be constants.
But even then (p. 175) ". . . there remains an objection in the seventh step."
Now, the seventh step consists of the statement that if

$\phi(x_i) = \sum c_i x_i,$

where the c's are independent of the x's, then due to the condition that $\phi$ be a
symmetric function, all the c's must be equal. To show the defect in this
step it is stated that under certain conditions ". . . the function $f = \bar{x} + \mu_3/\mu_2$
will have partial derivatives with respect to $x_i$ which are unequal and constant;
yet at the same time the function f is a symmetrical expression of the n variables."
Granting that all that is correct, what has this got to do with the
seventh step? The f function certainly is not of the type $\sum c_i x_i$ to which
the seventh step is applied.

5. One more point should be mentioned. On p. 181 it is supposedly proven
that any function satisfying the first three axioms must have continuous first
partial derivatives. The proof is essentially as follows: Assuming all the x's
are given the same increment $\Delta x$, the increment of the function then is $\Delta\phi$.
It is then stated ". . . but by axiom I, $\Delta\phi = \Delta x$. Therefore $\Delta\phi/\Delta x = 1 = d\phi/dx$.
In other words, the total derivative of $\phi$ exists and is constant. Therefore the
total derivative of $\phi$ is continuous." From this, the continuity of the first
partial derivatives is proven by means of Euler's Theorem for homogeneous
functions. Now, just what does the symbol $d\phi/dx$ (which is called the total
derivative) mean for a function of many independent variables? Besides,
(whatever this symbol means) is it considered rigorous to deduce a general
Theorem from the very special case where all the differentials are made equal?
This is one place where the f function could be used effectively as an exhibit
of a function satisfying the first three axioms, and not having continuous partial
derivatives.

It is also stated (p. 181) that ". . . it would seem more satisfactory to postulate
that the function $\phi$ is single valued, for the single-valuedness of a derivative
does not insure the single-valuedness of the integral while the single-valuedness
of a function does insure the single-valuedness of the derivative where the
derivative exists." This statement is certainly not self evident and requires
proof. For a single variable at least, it is easy to imagine a function represented
by a curve with corners defined in a certain interval. The function then
could be single valued everywhere in the interval, while the derivatives at the
corners may exist and have two distinct values, depending on whether the
corner is approached from the right or the left. On the other hand it is hard
to imagine a curve representing a single valued function such that the integral,
i.e. the function represented by the area under the curve, should not be single
valued.

6. In Conclusion: It is stated in the Introduction that ‘‘Since this book has 
had wide circulation, it is believed that the errors in this proof should be called 
to the attention of the users of the book. The present paper has been prepared 
for this purpose.” It is for the same reason, that this paper was prepared to 
show that no error has been proven to exist. 

Bureau of Ordnance, U. S. Navy Department 



NOTE ON THE BINOMIAL DISTRIBUTION 

By C. E. Clark 


The purpose of this note is to show that 

(1) $f(x) = \dfrac{(-1)^n\, n!\, p^x q^{n-x} \sin \pi x}{\pi\, x^{(n+1)}}$ 

where n is an integer $\ge 0$, $0 < p < 1$, $p + q = 1$, and $x^{(n+1)} = x(x - 1)(x - 2) \cdots (x - n)$, 
is a function whose values at $x = 0, 1, 2, \cdots, n$ are the successive 
terms of the expansion of $(q + p)^n$, and also to consider the problem of fitting 
$f(x)$ to an observed frequency distribution. 

The statement made about (1) can be verified by evaluating (1) as an indeterminate form. On the other hand, (1) can be derived by observing that the 
x-th term (x an integer) of the expansion of $(q + p)^n$ is 

(2) $\dfrac{n!}{x!\,(n - x)!}\, p^x q^{n-x} = \dfrac{\Gamma(n + 1)\, p^x q^{n-x}}{\Gamma(x + 1)\, \Gamma(n - x + 1)}$; 

then (1) can be derived from (2) by means of the product expansions for $\Gamma(x)$ 
and $\sin x$. This derivation of (1) from (2) can also be carried out by expressing 
(2) as a Beta function and then using 

$B(x + 1,\, n - x + 1) = \displaystyle\int_0^\infty \frac{t^x}{(1 + t)^{n+2}}\, dt = (-1)^n \frac{\pi\, x^{(n+1)}}{(n + 1)!\, \sin \pi x}.$ 

This integration can be performed by means of the theory of residues. 
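
The claim that (1) reproduces the binomial terms at the integers can be checked numerically. The sketch below assumes that (1) has the form $f(x) = (-1)^n n!\, p^x q^{n-x} \sin \pi x / (\pi x^{(n+1)})$, which is what the reflection formula for the Gamma function gives when applied to (2); the function names and the offset `eps` are illustrative choices, not part of the note.

```python
import math

def falling_product(x, n):
    """x^(n+1) = x (x-1) (x-2) ... (x-n)."""
    out = 1.0
    for j in range(n + 1):
        out *= (x - j)
    return out

def f(x, n, p):
    """Assumed form of (1); it is 0/0 exactly at the integers 0..n."""
    q = 1.0 - p
    return ((-1) ** n * math.factorial(n) * p ** x * q ** (n - x)
            * math.sin(math.pi * x) / (math.pi * falling_product(x, n)))

def binomial_term(k, n, p):
    """k-th term of the expansion of (q + p)^n."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

n, p, eps = 5, 0.3, 1e-6
for k in range(n + 1):
    # Step slightly off the integer to evaluate the indeterminate form.
    print(k, f(k + eps, n, p), binomial_term(k, n, p))
```

The two printed columns should agree to several decimal places for every k from 0 to n.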

Consider the problem of fitting (1) to an observed frequency distribution. 
We shall write (1) in the form 

(3) $F(z) = \dfrac{a\, b^x \sin \pi x}{x^{(n+1)}}, \qquad x = \dfrac{nb}{1 + b} + h(z - \bar z),$ 

and determine the constants a, b, n, and h so that, when $\bar z$ is the mean of the 
observed distribution, F(z) will fit the distribution. 

The values of a, b, n, and h can be determined by the method of moments. 
Let $\nu_2$, $\nu_3$, and $\nu_4$ denote the usual second, third, and fourth moments of the 
distribution, which are calculated in the usual way (as in W. P. Elderton, 
Frequency-Curves and Correlation) and not adjusted by any procedure such as 
Sheppard's adjustments. Also, use the usual notation $\beta_1 = \nu_3^2/\nu_2^3$ and $\beta_2 = \nu_4/\nu_2^2$. 

Then, the method of moments gives 


(4) 

( 6 ) 


2 



a *= (*-1)" - 7 ^ , where S/ is the sum of the frequencies of the distribution. 

?r(l + 0)** 

An integer n is chosen nearest the value assigned by (4). The two values of 
b from (5) determine two curves that are congruent but whose skewnesses are 
of opposite sign. Hence, b is uniquely determined by (6) and the sign of the 
skewness of the data. 

For a symmetrical distribution, $b = 1$, $\nu_3 = 0$, and 


3 — ft 



We shall consider an illustrative example. In the following table the columns 
$f(z)$ and $f_2(z)$ are taken from W. P. Elderton, Frequency-Curves and Correlation 
(1906), page 62. $f(z)$ is an empirical frequency distribution, while $f_2(z)$ is 
obtained by fitting a Pearson Type II curve to the distribution $f(z)$. $f_1(z)$ is 
computed from 


/i(«) = 1624 X = 2.0973 + .SOSz 


which is determined by the method of this note. $f_3(z)$ is obtained by fitting 
the normal curve 

$f_3(z) = 485.1\, e^{-(z - .4986)^2 / (2(1.829)^2)}$ 


  z     f(z)    f_1(z)   f_2(z)   f_3(z)
 -3      11       18       14       19
 -2     116      107      109       92
 -1     274      281      286      263
  0     451      438      433      444
  1     432      437      433      444
  2     267      267      285      263
  3     116      106      109       92
  4      16       18       14       19


The coefficients of goodness of fit for $f_1(z)$, $f_2(z)$, and $f_3(z)$ are respectively 
.35, .58, and .02. 



CONVEXITY PROPERTIES OF GENERALIZED MEAN VALUE 

FUNCTIONS* 

By Nilan Norris 


Consider the following generalized mean value functions: (1) the unit weight 
or simple sample form, $\phi(t) = \left(\dfrac{x_1^t + x_2^t + \cdots + x_n^t}{n}\right)^{1/t}$, in which the $x_i$ are positive 
real numbers not all equal each to each, and in which t may take any real 
value; (2) the weighted sample form, $\omega(t) = \left(\dfrac{c_1 x_1^t + c_2 x_2^t + \cdots + c_n x_n^t}{c_1 + c_2 + \cdots + c_n}\right)^{1/t}$, 
in which the $c_i$ are positive numbers not all equal each to each, and in which the 
$x_i$ and t are restricted as in (1); (3) the integral form, $\theta(t)$ 


where 


r r'. 

J x^O 




dx exists for every real value of t; and (4) the generalized integral 


form $\Phi(t) = \left(\displaystyle\int_0^\infty x^t\, d\psi(x)\right)^{1/t}$, where $\psi(x)$ is a non-decreasing function integrable 
in the Riemann-Stieltjes sense such that $\psi(\infty) - \psi(0) = 1$, and such that 
$\displaystyle\int_0^\infty x^t\, d\psi(x)$ exists for every real value of t. The facts that all of these functions are monotonic increasing and that both $\phi(t)$ and $\omega(t)$ have two horizontal 
asymptotes have been previously demonstrated.² Although the existence of 
$\phi(t)$ and $\omega(t)$ has been known since 1840, there appears to have been no attempt 
made to investigate the behavior of the second derivatives of them.³ 
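
The monotonic and doubly asymptotic behavior of the unit weight form can be illustrated numerically. The sketch below assumes the standard power-mean expression $\phi(t) = ((x_1^t + \cdots + x_n^t)/n)^{1/t}$, with $\phi(0)$ taken as its limit, the geometric mean; the data values are hypothetical.

```python
import math

def power_mean(xs, t):
    """Unit weight generalized mean; t = 0 is handled as the geometric mean."""
    n = len(xs)
    if t == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** t for x in xs) / n) ** (1.0 / t)

xs = [1.0, 2.0, 4.0]               # hypothetical price relatives
ts = [-50, -2, -1, 0, 1, 2, 50]    # harmonic, geometric, arithmetic, ... means
values = [power_mean(xs, t) for t in ts]
for t, v in zip(ts, values):
    print(t, v)
```

The printed values increase with t and stay between the smallest and the largest of the $x_i$, which are the two horizontal asymptotes.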

When the $x_i$ are price relatives, production relatives, or similar data, $\phi(t)$ 
and $\omega(t)$ yield common types of index numbers by direct substitution of integral 
values of t. For any values of t such that $0 < t_1 < t_2 < \infty$, the type bias of 
$\phi(t_2)$ will be greater than the type bias of $\phi(t_1)$. Similarly, for any values of t 
such that $-\infty < t_1 < t_2 < 0$, the type bias of $\phi(t_1)$ will be greater than the 
type bias of $\phi(t_2)$. The second derivatives of $\phi(t)$ and $\omega(t)$ indicate whether 


¹ Presented at a joint meeting of the American Mathematical Society, the Econometric 
Society, and the Institute of Mathematical Statistics at St. Louis on January 2, 1936. 
The writer is indebted to C. C. Craig, Einar Hille, Dunham Jackson, and J. Shohat for 
helpful critical reviews of the preliminary draft of this paper. 

² G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities (Cambridge University 
Press, London, 1934), pp. 12-15; and Nilan Norris, “Inequalities among Averages,” Annals 
of Mathematical Statistics, Vol. VI, No. 1, March, 1935, pp. 27-29. 

³ Jules Bienaymé, Société Philomatique de Paris, Extraits des procès-verbaux des séances 
pendant l’année 1840 (Imprimerie D’A. René et Cie., Paris, 1841), Séance du 13 juin 1840, 
p. 68. 
type bias is changing at an increasing or a decreasing rate as between the unlimited number of averages available for use. Considerable interest attaches 
to $\omega(t)$, the weighted sample form of function. 

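Let $\omega(t)$ be made arbitrary for the case of n = 2, with $x_1 = 1$, and $x_2 = e^{-\lambda}$, 
where $\lambda$ is any real number. Also let $c_1 = \alpha$, and $c_2 = \beta$, where $\alpha + \beta = 1$. 

Then $\omega(t) = [\alpha + \beta e^{-\lambda t}]^{1/t}$. Now for all values of t, 

$\alpha + \beta e^{-\lambda t} = 1 - \dfrac{\beta\lambda t}{1!} + \dfrac{\beta\lambda^2 t^2}{2!} - \dfrac{\beta\lambda^3 t^3}{3!} + \cdots$ 

For $|t|$ sufficiently small, it follows that 

$\log(\alpha + \beta e^{-\lambda t}) = -\beta\lambda t + \tfrac{1}{2}\beta\lambda^2(1 - \beta)t^2 + \beta\lambda^3\left[-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3}\right]t^3 + \cdots$ 

so that for $t \ne 0$ 

$\tfrac{1}{t}\log(\alpha + \beta e^{-\lambda t}) = -\beta\lambda + \tfrac{1}{2}\beta\lambda^2(1 - \beta)t + \beta\lambda^3\left[-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3}\right]t^2 + \cdots$ 

Therefore $\omega(t) = \exp\left[\tfrac{1}{t}\log(\alpha + \beta e^{-\lambda t})\right]$ 

$= e^{-\beta\lambda}\left[1 + \tfrac{1}{2}\beta\lambda^2(1 - \beta)t + \beta\lambda^3\left\{-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3} + \tfrac{1}{8}\beta\lambda(1 - \beta)^2\right\}t^2 + \cdots\right].$ 

It follows that $\omega''(0) = 2\beta\lambda^3 e^{-\beta\lambda}\left\{-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3} + \tfrac{1}{8}\beta\lambda(1 - \beta)^2\right\}$, 
that $\omega(0)$ is the weighted geometric mean, and that $\phi(0)$ is the unit weight or 
simple sample form of geometric mean. As a means of demonstrating the range 
of values which $\omega''(0)$ may take it is helpful to rewrite the expression for $\omega''(0)$ 
as follows: 

$\omega''(0) = \tfrac{1}{4}\beta^2(1 - \beta)^2\lambda^3\left[\lambda - \dfrac{4}{3}\,\dfrac{1 - 2\beta}{\beta(1 - \beta)}\right]e^{-\beta\lambda} = f(\lambda, \beta).$ 

This consideration makes it possible to distinguish three cases of $y = f(\lambda, \beta)$ 
for fixed $\beta$, namely, $0 < \beta < \tfrac{1}{2}$; $\beta = \tfrac{1}{2}$; and $\tfrac{1}{2} < \beta < 1$. In all three cases 
$f(\lambda, \beta)$ has an absolute minimum $\mu(\beta) \le 0$, and $\mu(\tfrac{1}{2}) = 0$. The corresponding 
value of $\lambda$ satisfies the quadratic equation 

$\lambda^2 - \dfrac{4}{3}\,\dfrac{4 - 5\beta}{\beta(1 - \beta)}\,\lambda + \dfrac{4 - 8\beta}{\beta^2(1 - \beta)} = 0.$ 

It is clear that by taking $\beta$ near enough to 0, one can make $\mu(\beta)$ as large negative 
as is desired. Also, by choosing $\lambda$ properly, one can make $\omega''(0)$ take any 
value between $\mu(\beta)$ and $\infty$. For example, when $\alpha = \beta = \tfrac{1}{2}$, $\lambda$ may be selected 
so as to make $\omega''(0)$ an arbitrarily chosen non-negative number. For then 
$\omega''(0) = e^{-\lambda/2}\left(\dfrac{\lambda^4}{64}\right)$, and as $\lambda$ increases from $-\infty$ to 0, $\omega''(0)$ decreases from $\infty$ to 
0. If $\lambda = 0$, $\omega''(0) = 0$. If $\lambda > 0$, as $\lambda$ increases from 0 to 8, $\omega''(0)$ increases to 
$64e^{-4}$, and as $\lambda$ increases beyond 8, $\omega''(0)$ decreases, approaching 0 as $\lambda$ increases 
indefinitely. It is evident that the case of $\alpha = \beta = \tfrac{1}{2}$, with $\lambda = -\log 2$, $x_1 = 1$, 
and $x_2 = 2$ is one in which $\omega(t)$ becomes the unit weight or simple sample 
type of generalized mean value function, namely, $\phi(t) = \left(\dfrac{1 + 2^t}{2}\right)^{1/t}$. Reference 
to the first expression above noted for $\omega''(0)$ will make clear that $\phi''(0) = \dfrac{(\log 2)^4}{64}\sqrt{2}$ in this special case. 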

Analysis of $\Phi(t)$, the generalized integral form of generalized mean value 
function, makes it possible to characterize populations of a very general character, as well as samples. But in the case of $\Phi(t)$ it is even more difficult to 
generalize as to convexity properties. For example, let 



where 


Eiu) = 4= 

■Vv 





dv. 


This expression is obviously of the required generalized integral type. Now 


[#(<)]' = 4= f* = 

VTT « 



dw = 6*. 


_ g 

Therefore 4>(f) = e*, and ^"(0 = > 0 That is, in this particular 

10 

case, $(f) has only one horizontal asymptote. 

The foregoing examples indicate that the following conclusions may be drawn 
as to the diverse convexity attributes of the various means as functions of t: 
(1) The unit weight form, $\phi(t)$, and the weighted sample form, $\omega(t)$, must always 
have a point of inflection, since both of them not only increase with t, but are 
doubly asymptotic (have two horizontal asymptotes). (2) Points of inflection 
for $\phi(t)$ and $\omega(t)$ do not necessarily occur at t = 0. (3) The generalized integral 
form, $\Phi(t)$, need not always have a point of inflection. That is, the second 
derivatives of certain forms of $\Phi(t)$ do not change their sign, since such forms 
are concave upward. 


University of Michigan. 



A SIMPLE FORM OF PERIODOGRAM 


By Dinsmore Alter 

Schuster’s introduction of a method of systematic search for hidden periodici- 
ties and cycles opened a new field for the investigator of statistical data. The 
beauty of his method in its analogy to analysis of light, and the great reputa- 
tion of its author, combined to give it universal acceptance and to blind statis- 
ticians to its faults. 

In more recent years at least three new mathematical and two mechanical 
forms of periodogram analysis have been proposed, each of which exhibits 
certain advantages over the original one. The use of the term periodogram 
for these forms is an extension of Schuster’s original definition which used as 
abscissae quantities proportional to the squares of the amplitudes of the sine 
terms found in the data for the various trial periods. He wrote: “It is convenient to have a word for some representation of a variable quantity which 
shall correspond to the spectrum of a luminous radiation. I propose the word 
periodogram and define it more particularly in the following way: 

Let $\tfrac{1}{2}Ta = \displaystyle\int_{t_1}^{t_1+T} f(t) \cos kt\, dt$ and $\tfrac{1}{2}Tb = \displaystyle\int_{t_1}^{t_1+T} f(t) \sin kt\, dt$ 

where T may for convenience be chosen equal to some integer multiple of $2\pi/k$, 
and plot a curve with $2\pi/k$ as abscissae and $r = \sqrt{a^2 + b^2}$ as ordinates; this curve, 
or better, the space between this curve and the axis of abscissae, represents the 
periodogram of f(t).” 

The following appear to be the essential criteria for a satisfactory form of 
periodogram: 

1. It must exhibit plainly any repetition of form in the data regardless of 
how irregular the shape of the repeated interval may be. In doing this it 
must exaggerate the amplitude of the main terms at the expense of the 
lesser ones. 

2. The calculation of the indices must be short. In a periodogram from 
many data the indices sometimes are computed for several hundred trial 
periods. 

3. There should be a geometrical interpretation of the index used. 

4. The frequency distribution of the index must be known. 

5. Combining or smoothing the data should modify the index in a manner 
which leaves an obvious interpretation. 
The Schuster periodogram has the following disadvantages: 

1. Only sine terms of large amplitude are exhibited. A perfect repetition 
of an extremely irregular form of data would not be indicated in any way. 

2. The calculations are long. 

3. There is a considerable uncertainty in the length of the period found. 
Those methods of analysis which use harmonics as well as the fundamental 
have much less of this uncertainty. 

The correlation periodogram has advantages in each of these points over the 
Schuster. However, even with it the calculations are fairly long. Further- 
more, the modification of the coefficient introduced by grouping or smoothing 
is not a linear one. 

The periodogram described here is a slight modification of one for which a 
preliminary note was published in 1933. Additional features have been studied 
and its applications to many data have shown its ease of calculation. This 
calculation has been reduced still more by a mechanical method which renders 
it practicable to contemplate the possibility of studying many data hitherto 
prohibited by excessive cost. 

Consider data $x_0, x_1, x_2, \cdots, x_i, \cdots, x_{(n-1)}$. Let $l$ be any integer less than n. 
Form the sum of the absolute values of $x_i - x_{(i-l)}$, designated by $\sum |x_i - x_{(i-l)}|$. 

Define $A_l = \dfrac{\sum |x_i - x_{(i-l)}|}{n - l}$. 

$l$ is called the lag. $l$ takes the values of the various trial periods and 
$A_l$, therefore, is the mean error between prediction that data 
will be repeated after a lag of $l$ and the fulfillment of the prediction. Such 
an index has a meaning that is immediately of use to a meteorologist or other 
investigator. Coefficients such as the Schuster and the correlation coefficient, 
although valuable statistically, are of less immediate interest. 
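
The index is simple enough to compute directly. The following sketch assumes the definition $A_l = \sum |x_i - x_{i-l}| / (n - l)$ with the sum running over the $n - l$ available pairs; the test series (a pure sine of period 12) and the function name are illustrative.

```python
import math

def lag_index(data, lag):
    """A_l: mean absolute difference between data separated by `lag`."""
    n = len(data)
    total = sum(abs(data[i] - data[i - lag]) for i in range(lag, n))
    return total / (n - lag)

period = 12
data = [math.sin(2 * math.pi * i / period) for i in range(240)]
indices = {lag: lag_index(data, lag) for lag in range(1, 40)}
for lag in (6, 11, 12, 13, 24):
    print(lag, indices[lag])
```

For this series the index drops essentially to zero at lags 12 and 24 and is largest near the half period, so a real periodicity shows up as sharp minima at multiples of the period.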

The standard deviation of these errors of prediction follows at once from 
standard formulae under assumption of normal distribution. 


$\sigma = 1.25\, A_l$ 


The distribution of $\sigma$, as computed from the absolute values of data, has 
been studied by Helmert and by Fisher. Davies and E. S. Pearson have compared the various methods of estimating $\sigma$. For the large number, $(n - l)$, of 
pairs of data used for a periodogram point, this method becomes almost as 
precise as the usual one which would square the values of $(x_i - x_{i-l})$. For 
$(n - l)$ as small as 60, the standard deviation of the standard deviation by this 
method is only seven percent larger than by the other one. Fisher has shown 
that 

$\sigma_\sigma \to \sigma \sqrt{\dfrac{\pi - 2}{2(n - l)}}$ as $(n - l) \to \infty$ 

This may be written as 

$\dfrac{1.068\, \sigma}{\sqrt{2(n - l)}}$ 

The distribution approaches normal rapidly and for all values of $(n - l)$ that 
would be used in periodogram calculation certainly may be considered as normal. 
It will be very seldom that a value of $(n - l)$ much smaller than 200 will be 
used. 

The data may be printed on two strips of adding machine tape held together 
by clips so as to match data separated by a lag $l$. In arranging them for investigation, it usually is most convenient to make all numbers positive. The 
computer subtracts mentally and puts the difference into an adding machine, 
which gives him $A_l$ almost immediately. 

For some computers, and especially where the numbers are large, another 
method of obtaining $A_l$ may save time or lead to fewer numerical mistakes. The 
computer will form the sum of all his data. He will, as for the other form of 
computation, put these on two pieces of adding machine tape that he lays side 
by side. However, instead of putting the difference of the pairs into the machine, he will, in each case, put in the smaller datum of the pair. Then, 

$(n - l) A_l = 2 \sum \text{all data} - \left[\sum \text{1st } l + \sum \text{last } l \text{ data}\right] - 2 \sum \text{smaller}$ 

The derivation of this equation is obvious. In computing by this method the 
subtotaler on the machine can be used to make the strip of sums of the first 
$(n - l)$ data and of the last $(n - l)$ for all values of $l$. The first term on the 
right hand side is a constant, the last is twice the sum of the smaller numbers 
chosen in the pairs. I have computed by both methods, and where the numbers 
are small, I prefer the former. Where they are large, I prefer the latter. However, when one must use comparatively untrained computers, he will find fewer 
mistakes made if the computer does not make the subtractions. 

The calculation of $A_l$ is much shorter than that for the indices even of the 
correlation and variance periodograms. It may, however, be shortened even 
more by a mechanical arrangement. $(n - l)A_l$ is the area between two histograms of the data matched after a lag $l$. These may be carefully graphed on a 
large scale and two such graphs superposed over a table with a translucent 
illuminated top. On the edge of this table is the track to guide a rolling planimeter. $A_l$, as computed by this means, is accurate to approximately one-half 
of one percent of its value, a much more exact value than is needed. The 
details of such a device as constructed for the Griffith Observatory are shown by 
the accompanying photograph and diagram. The dual saving of time by the 
method and by its mechanical application has resulted in the adoption of a 
much more ambitious program of meteorological research than previously was 
contemplated. 
The form taken by the periodogram is important. Consider the simplest 
case, data which follow a sine curve. 

$y_i - y_{i-l} = 2a \sin \dfrac{\pi l}{p} \left[\sin \dfrac{2\pi i - \pi l}{p}\right]$ 

The term in brackets takes values distributed around the circle and the part 
outside is a constant for any one lag. The bracket term sums approximately to 
$\dfrac{2}{\pi}(n - l)$, since we consider all terms as of one sign only. 

If the absolute values were not considered in the expression for $A_l$, the periodogram would be a sine curve of period 2p. The lack of sign gives a cusp curve 
with the cusp at lags p, 2p, etc. Such a form is advantageous in that the 
periodogram gives sharp peaks at multiples of the periods which may exist. 

The effect of the periodogram in exaggerating the principal terms at the 
expense of the smaller ones may be obtained most easily by equating $\sigma$ as 
obtained by the linear and the quadratic formulae. 
The data may be written as the sum of cosine terms 

( 2iri — ipt\ , , / 27rf — ipb\ , 

Vi - Vi-i = 2(1 sin - fsin ij’*! + . . . _ c_._,) 

Pa L P« J 

H iVi — Vi-if = 2(n — l)c^ sin’* — + 2(n — l)b^ sin** — + •••+(« — 1) \/2 ot 

Pa Pb 

The sine terms contribute to in proportion to the squares of their ampli- 

2 TtI • 

tudes. On account of the sin — factor, they contril)ute very little to values 

Pi 

of A I for which — is not very closely an even multiple of ir. 


This method has been applied to rainfall data of the Pacific Coast and has 
proved as satisfactory in practice as would be expected from the simplicity 
of the theory. The periodogram of rainfall stations along the northern third 
of the California coast is shown here, exhibiting perhaps the most definite 
single piece of evidence ever found for rainfall cycles. Outstanding is a cycle 
of about 45 years with its fourth harmonic as the secondary feature. The 
writer expects to publish the results of that work in the Monthly Weather 
Review. 





ON CERTAIN DISTRIBUTIONS DERIVED FROM THE MULTINOMIAL 

DISTRIBUTION¹ 

By Solomon Kullback 


1. Introduction. With the multinomial distribution as a background, there 
may be derived a number of distributions which are of interest in certain prac- 
tical applications. Several of these distributions are here presented and the 
theory is illustrated by specific examples. 


2. Preliminary data. In the discussion of the distributions to be considered 
there are needed certain factorial sums whose values are now to be derived. 
In the following discussion only positive integral values (including zero) are 
to be considered. 

There is desired the value, in terms of N, n, r, of 

(2.1) $f_r(n, N) = \sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}$ 

where the summation is for all values of $x_1, x_2, \cdots, x_n$ such that $x_1 + x_2 + \cdots + x_n = N$ and no x is equal to r. 

Let us first consider the case for r = 0; i.e., we desire a value for the sum in 
(2.1) for all values of $x_1, x_2, \cdots, x_n$ such that $x_1 + x_2 + \cdots + x_n = N$ and 
no x is equal to zero. By the multinomial theorem, we have that² 

(2.2) $(a_1 + a_2 + \cdots + a_n)^N = \sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n}$ 

where the summation is for all values of $x_1, x_2, \cdots, x_n$ such that $x_1 + x_2 + \cdots + x_n = N$. If $a_1 = a_2 = \cdots = a_n = 1$, then 

(2.3) $n^N = \sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}, \qquad x_1 + x_2 + \cdots + x_n = N.$ 

The sum in (2.3) may however be rearranged into the sum of a number of 
terms as follows: 

(2.4) 
$\sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}, \qquad x_1 + x_2 + \cdots + x_n = N$, no x = 0; 

$n \sum \dfrac{N!}{x_1!\, x_2! \cdots x_{n-1}!}, \qquad x_1 + x_2 + \cdots + x_{n-1} = N$, no x = 0; 

$\dfrac{n(n-1)}{2!} \sum \dfrac{N!}{x_1!\, x_2! \cdots x_{n-2}!}, \qquad x_1 + x_2 + \cdots + x_{n-2} = N$, no x = 0; 

$\cdots$ 

$\dbinom{n}{r} \sum \dfrac{N!}{x_1!\, x_2! \cdots x_{n-r}!}, \qquad x_1 + x_2 + \cdots + x_{n-r} = N$, no x = 0. 


¹ Presented to the Institute of Mathematical Statistics January 2, 1936. 

² H. S. Hall & S. R. Knight, Higher Algebra, MacMillan & Co., 4th Ed. (1924), Chap. 15. 
Thus we may rewrite (2.3) as 

(2.5) $n^N = f_0(n, N) + n f_0(n - 1, N) + \dfrac{n(n-1)}{2!} f_0(n - 2, N) + \cdots + \dbinom{n}{r} f_0(n - r, N) + \cdots$ 

Replacing n by n − 1 in (2.5) there is obtained 

(2.6) $(n - 1)^N = f_0(n - 1, N) + (n - 1) f_0(n - 2, N) + \cdots + \dbinom{n-1}{r} f_0(n - r - 1, N) + \cdots$ 

Multiplying (2.6) by n and subtracting the result from (2.5), there is obtained 

(2.7) $n^N - n(n - 1)^N = f_0(n, N) - \dfrac{n(n-1)}{2!} f_0(n - 2, N) - \cdots - r\dbinom{n}{r+1} f_0(n - r - 1, N) - \cdots$ 

Replacing n by n − 2 in (2.5) there is obtained 

(2.8) $(n - 2)^N = f_0(n - 2, N) + (n - 2) f_0(n - 3, N) + \cdots + \dbinom{n-2}{r-1} f_0(n - r - 1, N) + \cdots$ 

Multiplying (2.8) by n(n − 1)/2 and adding the result to (2.7), there is obtained 

(2.9) $n^N - n(n - 1)^N + \dfrac{n(n-1)}{2!}(n - 2)^N = f_0(n, N) + \dfrac{n(n-1)(n-2)}{3!} f_0(n - 3, N) + \cdots + \dfrac{r(r-1)}{2!}\dbinom{n}{r+1} f_0(n - r - 1, N) + \cdots$ 

Continuing this process, there is finally obtained the result that 

(2.10) $f_0(n, N) = n^N - n(n - 1)^N + \dfrac{n(n-1)}{2!}(n - 2)^N - \cdots \pm n \cdot 1^N$ 

It may be shown³ that the right side of (2.10) is $\Delta^n x^N$ for x = 0. The author 
has elsewhere obtained (2.10), but by a special procedure not applicable to the 
general case.⁴ 

We may readily verify (2.10) for example, for n = 3, N = 5. If $x_1 + x_2 + x_3 = 5$ 
and no x = 0, then the sets of solutions are (3,1,1), (1,3,1), (1,1,3), 
(2,2,1), (2,1,2), (1,2,2), and $f_0(3, 5) = 3 \cdot 5!/3! + 3 \cdot 5!/(2!\,2!) = 150$. From (2.10) 
there is obtained $f_0(3, 5) = 3^5 - 3 \cdot 2^5 + 3 \cdot 2/2 = 150$. 
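
The verification can be repeated mechanically. The sketch below compares a brute-force enumeration of the sum (2.1) with the alternating sum (2.10), written here as $\sum_k (-1)^k \binom{n}{k}(n-k)^N$; the function names are illustrative and the enumeration is only practical for small n and N.

```python
from itertools import product
from math import comb, factorial

def multinomial_sum(n, N, excluded):
    """Sum of N!/(x_1! ... x_n!) over x_1 + ... + x_n = N with no x in `excluded`."""
    total = 0
    for xs in product(range(N + 1), repeat=n):
        if sum(xs) != N or any(x in excluded for x in xs):
            continue
        term = factorial(N)
        for x in xs:
            term //= factorial(x)   # exact: partial quotients are integers
        total += term
    return total

def f0_formula(n, N):
    """Right side of (2.10) as an alternating binomial sum."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** N for k in range(n + 1))

print(multinomial_sum(3, 5, {0}), f0_formula(3, 5))
```

Both numbers printed are the value 150 found in the text for n = 3, N = 5.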


³ E. T. Whittaker & G. Robinson, The Calculus of Observations, Blackie & Son Ltd. 
(1924), p. 7. 

⁴ S. Kullback, “On the Bernoulli Distribution,” Bull. Am. Math. Soc., December, 1936. 
For the general case, we return again to (2.3) and rearrange the right side 
into the sum of a number of terms as follows: 

$\sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}, \qquad x_1 + x_2 + \cdots + x_n = N$, no x = r; 

$\dfrac{n}{r!} \sum \dfrac{N!}{x_1!\, x_2! \cdots x_{n-1}!}, \qquad x_1 + x_2 + \cdots + x_{n-1} = N - r$, no x = r; 

$\dfrac{n(n-1)}{2!\,(r!)^2} \sum \dfrac{N!}{x_1!\, x_2! \cdots x_{n-2}!}, \qquad x_1 + x_2 + \cdots + x_{n-2} = N - 2r$, no x = r; 

Thus we may rewrite (2.3) as 

(2.12) $n^N = f_r(n, N) + \dfrac{n N^{(r)}}{r!} f_r(n - 1, N - r) + \dfrac{n(n-1) N^{(2r)}}{2!\,(r!)^2} f_r(n - 2, N - 2r) + \cdots$ 

where $N^{(k)} = N(N - 1)(N - 2) \cdots (N - k + 1)$. 

Replacing n by n − 1 and N by N − r in (2.12) there is obtained 

(2.13) $(n - 1)^{N-r} = f_r(n - 1, N - r) + \dfrac{(n - 1)(N - r)^{(r)}}{r!} f_r(n - 2, N - 2r) + \cdots$ 

Multiplying (2.13) by $\dfrac{n N^{(r)}}{r!}$ and subtracting the result from (2.12), there is 
obtained 

(2.14) $n^N - \dfrac{n N^{(r)}}{r!}(n - 1)^{N-r} = f_r(n, N) - \dfrac{n(n-1) N^{(2r)}}{2!\,(r!)^2} f_r(n - 2, N - 2r) - \cdots$ 

By continuing this process, in a manner similar to that used for the case r = 0, 
there is finally obtained 

(2.15) $f_r(n, N) = n^N - \dfrac{n N^{(r)}}{r!}(n - 1)^{N-r} + \dfrac{n(n-1) N^{(2r)}}{2!\,(r!)^2}(n - 2)^{N-2r} - \cdots \pm \dbinom{n}{k}\dfrac{N^{(kr)}}{(r!)^k}(n - k)^{N-kr} \mp \cdots$ 

By setting r = 0 in (2.15), there is of course obtained the value already 
found in (2.10). 

We may readily verify (2.15) for example, for n = 3, N = 5, r = 2. If 
$x_1 + x_2 + x_3 = 5$ and no x = 2, then the sets of solutions are (5,0,0), (0,5,0), 
(0,0,5), (4,1,0), (1,4,0), (1,0,4), (4,0,1), (0,1,4), (0,4,1), (3,1,1), (1,3,1), (1,1,3), 
and $f_2(3, 5) = 3 \cdot 5!/5! + 6 \cdot 5!/4! + 3 \cdot 5!/3! = 93$. From (2.15) there is obtained 
$f_2(3, 5) = 3^5 - 3 \cdot 5 \cdot 4 \cdot 2^3/2! + 3 \cdot 2 \cdot 5 \cdot 4 \cdot 3 \cdot 2/(2!\,(2!)^2) = 93$. 
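
The general formula can be checked the same way. The sketch below assumes (2.15) has the general term $(-1)^k \binom{n}{k} N^{(kr)} (n-k)^{N-kr}/(r!)^k$, with terms for which $kr > N$ vanishing, and compares it with a direct enumeration; the helper names are illustrative.

```python
from itertools import product
from math import comb, factorial

def falling(N, k):
    """N^(k) = N (N-1) ... (N-k+1), with N^(0) = 1."""
    out = 1
    for j in range(k):
        out *= (N - j)
    return out

def fr_brute(n, N, r):
    """Direct evaluation of (2.1): no x may equal r."""
    total = 0
    for xs in product(range(N + 1), repeat=n):
        if sum(xs) == N and r not in xs:
            term = factorial(N)
            for x in xs:
                term //= factorial(x)
            total += term
    return total

def fr_formula(n, N, r):
    """Alternating sum assumed for (2.15)."""
    total = 0
    for k in range(n + 1):
        if k * r > N:
            break                      # N^(kr) is zero from here on
        term = comb(n, k) * falling(N, k * r) * (n - k) ** (N - k * r)
        term //= factorial(r) ** k     # exact: the quotient is a multinomial count
        total += term if k % 2 == 0 else -term
    return total

print(fr_brute(3, 5, 2), fr_formula(3, 5, 2))
```

Both evaluations give 93 for n = 3, N = 5, r = 2, and setting r = 0 reproduces the value 150 from (2.10).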

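The same method of procedure may be applied to evaluate 

(2.16) $f_{rs}(n, N) = \sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}, \qquad x_1 + x_2 + \cdots + x_n = N$, no x = r, s. 

Thus, there is derived the result that 

(2.17) $f_{rs}(n, N) = n^N - n\left(\dfrac{N^{(r)}(n - 1)^{N-r}}{r!} + \dfrac{N^{(s)}(n - 1)^{N-s}}{s!}\right)$ 

$\quad + n(n - 1)\left(\dfrac{N^{(2r)}(n - 2)^{N-2r}}{2!\,(r!)^2} + \dfrac{N^{(r+s)}(n - 2)^{N-r-s}}{(r!)(s!)} + \dfrac{N^{(2s)}(n - 2)^{N-2s}}{2!\,(s!)^2}\right)$ 

$\quad - n(n - 1)(n - 2)\left(\dfrac{N^{(3r)}(n - 3)^{N-3r}}{3!\,(r!)^3} + \dfrac{N^{(2r+s)}(n - 3)^{N-2r-s}}{2!\,(r!)^2(s!)} + \dfrac{N^{(r+2s)}(n - 3)^{N-r-2s}}{2!\,(r!)(s!)^2} + \dfrac{N^{(3s)}(n - 3)^{N-3s}}{3!\,(s!)^3}\right) + \cdots$ 

We may readily verify (2.17) for example, for n = 3, N = 5, r = 0, s = 2. 
If $x_1 + x_2 + x_3 = 5$ and no x = 0 or 2, then the sets of solutions are (3,1,1), 
(1,3,1), (1,1,3) and $f_{02}(3, 5) = 3 \cdot 5!/3! = 60$. From (2.17) there is obtained 
$f_{02}(3, 5) = 3^5 - 3(2^5 + 5 \cdot 4 \cdot 2^3/2) + 3 \cdot 2(1/2! + 5 \cdot 4/2! + 5 \cdot 4 \cdot 3 \cdot 2/(2!\,(2!)^2)) = 60$. 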
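
A compact way to check the two-value case is to write the k-th group of terms as a sum over how many of the k removed cells hold r and how many hold s. The sketch below assumes that form, $\sum_k (-1)^k \binom{n}{k} \sum_j \binom{k}{j} N^{(jr + (k-j)s)} (n-k)^{N - jr - (k-j)s} / ((r!)^j (s!)^{k-j})$ with $r \ne s$, and compares it with direct enumeration; all names are illustrative.

```python
from itertools import product
from math import comb, factorial

def falling(N, k):
    """N^(k) = N (N-1) ... (N-k+1); zero when k > N."""
    out = 1
    for j in range(k):
        out *= (N - j)
    return out

def frs_brute(n, N, r, s):
    """Direct sum of N!/(x_1! ... x_n!) with no x equal to r or s."""
    total = 0
    for xs in product(range(N + 1), repeat=n):
        if sum(xs) == N and r not in xs and s not in xs:
            term = factorial(N)
            for x in xs:
                term //= factorial(x)
            total += term
    return total

def frs_formula(n, N, r, s):
    """Inclusion-exclusion form assumed for (2.17); requires r != s."""
    total = 0
    for k in range(n + 1):
        group = 0
        for j in range(k + 1):                 # j cells hold r, k - j hold s
            used = j * r + (k - j) * s
            if used > N:
                continue
            term = comb(k, j) * falling(N, used) * (n - k) ** (N - used)
            group += term // (factorial(r) ** j * factorial(s) ** (k - j))
        total += comb(n, k) * group * (-1) ** k
    return total

print(frs_brute(3, 5, 0, 2), frs_formula(3, 5, 0, 2))
```

For n = 3, N = 5, r = 0, s = 2 both evaluations give 60, in agreement with the hand computation.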
It will be shown later (see section 8) that 

(2.18) $f_r(n, N) = f_{rs}(n, N) + \dfrac{n N^{(s)}}{s!} f_{rs}(n - 1, N - s) + \cdots$ 

(2.19) $f_s(n, N) = f_{rs}(n, N) + \dfrac{n N^{(r)}}{r!} f_{rs}(n - 1, N - r) + \dfrac{n(n - 1) N^{(2r)}}{2!\,(r!)^2} f_{rs}(n - 2, N - 2r) + \cdots$ 

From (2.18) and (2.19) there may be derived, by a method similar to that 
employed in deriving (2.15), that 

(2.20) $f_{rs}(n, N) = f_r(n, N) - \dfrac{n N^{(s)}}{s!} f_r(n - 1, N - s) + \dfrac{n(n - 1) N^{(2s)}}{2!\,(s!)^2} f_r(n - 2, N - 2s) - \cdots$ 

This latter result also follows from (2.17) and (2.15). 
Let us now consider the following generalization of (2.1). There is desired 
in terms of $N, n, r, a_1, a_2, \cdots, a_n$, the value of 

(2.21) $F_r(n, N, a_1, a_2, \cdots, a_n) = \sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n}$ 

where $a_1, a_2, \cdots, a_n$ are constants and the summation is for all values of 
$x_1, x_2, \cdots, x_n$ such that $x_1 + x_2 + \cdots + x_n = N$ and no x = r. The method 
of procedure is the same as that for the case already considered, viz. when 
$a_1 = a_2 = \cdots = a_n = 1$. 

The sum in (2.2) may be rearranged into the sum of a number of terms as 
follows: 

(2.22) 
$\sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N$, no x = r; 

$\dfrac{a_1^r N!}{r!} \sum \dfrac{a_2^{x_2} \cdots a_n^{x_n}}{x_2! \cdots x_n!} + \cdots + \dfrac{a_n^r N!}{r!} \sum \dfrac{a_1^{x_1} \cdots a_{n-1}^{x_{n-1}}}{x_1! \cdots x_{n-1}!}, \qquad x_1 + \cdots + x_{n-1} = N - r$, etc., no x = r; 

$\cdots$ 

$\dfrac{a_1^r \cdots a_k^r\, N!}{(r!)^k} \sum \dfrac{a_{k+1}^{x_{k+1}} \cdots a_n^{x_n}}{x_{k+1}! \cdots x_n!} + \cdots + \dfrac{a_{n-k+1}^r \cdots a_n^r\, N!}{(r!)^k} \sum \dfrac{a_1^{x_1} \cdots a_{n-k}^{x_{n-k}}}{x_1! \cdots x_{n-k}!}, \qquad x_1 + \cdots + x_{n-k} = N - kr$, etc., no x = r; 

$\cdots$ 

For convenience, let us write 

(2.23) 
$A(n, N) = (a_1 + a_2 + \cdots + a_n)^N$ 

$A_i(n - 1, N) = (a_1 + \cdots + a_{i-1} + a_{i+1} + \cdots + a_n)^N$ 

$A_{ij}(n - 2, N) = (a_1 + \cdots + a_{i-1} + a_{i+1} + \cdots + a_{j-1} + a_{j+1} + \cdots + a_n)^N$ 

$G_r(n, N) = F_r(n, N, a_1, a_2, \cdots, a_n)$ 

$G_r(n - 1, N, a_i) = F_r(n - 1, N, a_1, a_2, \cdots, a_{i-1}, a_{i+1}, \cdots, a_n)$ 

$G_r(n - 2, N, a_i, a_j) = F_r(n - 2, N, a_1, \cdots, a_{i-1}, a_{i+1}, \cdots, a_{j-1}, a_{j+1}, \cdots, a_n)$ 
From (2.24), there are obtained n equations 

(2.25) $A_i(n - 1, N - r) = G_r(n - 1, N - r, a_i) + \dfrac{(N - r)^{(r)}}{r!} \sum_{j=1}^{n} a_j^r\, G_r(n - 2, N - 2r, a_i, a_j) + \cdots \qquad (i = 1, 2, \cdots, n;\ j \ne i)$ 

Multiplying (2.25) by $a_i^r N^{(r)}/r!$ and subtracting the result from (2.24), there 
is obtained 

(2.26) $A(n, N) - \dfrac{N^{(r)}}{r!} \sum_{i=1}^{n} a_i^r A_i(n - 1, N - r) = G_r(n, N) - \dfrac{N^{(2r)}}{2!\,(r!)^2} \sum_{i,j=1}^{n} a_i^r a_j^r\, G_r(n - 2, N - 2r, a_i, a_j) - \cdots \qquad (i \ne j$, etc.$)$. 

Continuing this procedure, there is finally obtained 

(2.27) $G_r(n, N) = F_r(n, N, a_1, a_2, \cdots, a_n) = A(n, N) - \dfrac{N^{(r)}}{r!} \sum_{i=1}^{n} a_i^r A_i(n - 1, N - r) + \dfrac{N^{(2r)}}{2!\,(r!)^2} \sum_{i,j=1}^{n} a_i^r a_j^r A_{ij}(n - 2, N - 2r) - \cdots \qquad (i \ne j$, etc.$)$ 

Similar results are obtainable for 

(2.28) $G_{rs \cdots t} = F_{rs \cdots t}(n, N, a_1, a_2, \cdots, a_n) = \sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n}$ 

where the summation is for all values of x, such that $x_1 + x_2 + \cdots + x_n = N$, 
and no x = r, s, $\cdots$, or t. 

Thus, it will be shown later (see section 8), that 

(2.29) $G_r(n, N) = G_{rs}(n, N) + \dfrac{N^{(s)}}{s!} \sum_{i=1}^{n} a_i^s\, G_{rs}(n - 1, N - s, a_i) + \dfrac{N^{(2s)}}{2!\,(s!)^2} \sum_{i,j=1}^{n} a_i^s a_j^s\, G_{rs}(n - 2, N - 2s, a_i, a_j) + \cdots \qquad (i \ne j$, etc.$)$ 

Corresponding to the derivation of (2.27), there is obtained from (2.29) 
the fact that 

(2.30) $G_{rs}(n, N) = G_r(n, N) - \dfrac{N^{(s)}}{s!} \sum_{i=1}^{n} a_i^s\, G_r(n - 1, N - s, a_i) + \dfrac{N^{(2s)}}{2!\,(s!)^2} \sum_{i,j=1}^{n} a_i^s a_j^s\, G_r(n - 2, N - 2s, a_i, a_j) - \cdots \qquad (i \ne j$, etc.$)$ 

3. The problem to be studied. Consider a trial in which one of n mutually 
exclusive events may occur, with the respective probabilities of occurrence 
$p_1, p_2, \cdots, p_n$ where $p_1 + p_2 + \cdots + p_n = 1$. The probabilities of the 
various combinations of events which are possible in N trials are given by the 
terms of the expansion of $(p_1 + p_2 + \cdots + p_n)^N$. 

In the N trials some of the possible events may not occur, others may occur 
once, twice, etc. It is desired to study the distribution of the number of events 
which do not occur; the distribution of the number of events which occur once 
each, etc. The simultaneous distributions of the events above described are 
also to be studied. 

For example, the possible event may be the occurrence of a digit. A study 
of a sequence of random digits, in sets of ten, yielded the following three 
sample sets. 


 0   1   2   3   4   5   6   7   8   9
 1   0   2   1   1   2   1   0   0   2
 1   1   1   1   1   1   2   0   1   1
 0   0   2   1   2   1   2   1   0   1

Fig. 1 


In the first set three events do not occur, four occur once each, and three occur 
twice each. In the second set one event does not occur, eight events occur once 
each, and one event occurs twice ; etc. 
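
The classification used here can be reproduced by tallying how many digits occur 0, 1, 2, ... times in each set. The sketch below takes the three rows of Fig. 1 as the occurrence counts of the digits 0 to 9; the variable and function names are illustrative.

```python
from collections import Counter

# Rows of Fig. 1: number of times each digit 0..9 occurs in a set of ten.
sample_sets = [
    [1, 0, 2, 1, 1, 2, 1, 0, 0, 2],
    [1, 1, 1, 1, 1, 1, 2, 0, 1, 1],
    [0, 0, 2, 1, 2, 1, 2, 1, 0, 1],
]

def occupancy_profile(counts):
    """Map each multiplicity to the number of events occurring that many times."""
    return dict(Counter(counts))

for counts in sample_sets:
    print(sum(counts), occupancy_profile(counts))
```

Each row sums to ten, and the first two profiles are the ones described in the text.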

4. Distribution of the number of events not occurring. To obtain the distribution of the number of events which do not occur, there is applied to the 
expansion of $(p_1 + p_2 + \cdots + p_n)^N$ a procedure similar to that employed 
in section 2. 

Thus, if $\pi_{r0}$ represents the probability for r events not occurring, then 

(4.1) 
$\pi_{00} = \sum \dfrac{N!}{x_1!\, x_2! \cdots x_n!}\, p_1^{x_1} p_2^{x_2} \cdots p_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N$, no x = 0; 

$\pi_{10} = \sum \dfrac{N!}{x_2! \cdots x_n!}\, p_2^{x_2} \cdots p_n^{x_n} + \cdots + \sum \dfrac{N!}{x_1! \cdots x_{n-1}!}\, p_1^{x_1} \cdots p_{n-1}^{x_{n-1}}, \qquad x_1 + x_2 + \cdots + x_{n-1} = N$, etc., no x = 0; 

$\pi_{r0} = \sum \dfrac{N!}{x_{r+1}! \cdots x_n!}\, p_{r+1}^{x_{r+1}} \cdots p_n^{x_n} + \cdots + \sum \dfrac{N!}{x_1! \cdots x_{n-r}!}\, p_1^{x_1} \cdots p_{n-r}^{x_{n-r}}, \qquad x_1 + x_2 + \cdots + x_{n-r} = N$, etc., no x = 0; 

Employing (2.21), we may write (4.1) as 

(4.2) 
$\pi_{00} = F_0(n, N, p_1, p_2, \cdots, p_n)$ 

$\pi_{10} = F_0(n - 1, N, p_2, \cdots, p_n) + \cdots + F_0(n - 1, N, p_1, p_2, \cdots, p_{n-1})$ 

$\pi_{r0} = F_0(n - r, N, p_{r+1}, \cdots, p_n) + \cdots + F_0(n - r, N, p_1, \cdots, p_{n-r})$ 

Since $p_1 + p_2 + \cdots + p_n = 1$ there is found from (2.27) that 

(4.3) 
$\pi_{00} = 1 - \sum_{i=1}^{n} (1 - p_i)^N + \dfrac{1}{2!} \sum_{i,j=1}^{n} (1 - p_i - p_j)^N - \dfrac{1}{3!} \sum_{i,j,k=1}^{n} (1 - p_i - p_j - p_k)^N + \cdots$ 

$\pi_{10} = \sum_{i=1}^{n} (1 - p_i)^N - \sum_{i,j=1}^{n} (1 - p_i - p_j)^N + \dfrac{1}{2!} \sum_{i,j,k=1}^{n} (1 - p_i - p_j - p_k)^N - \cdots$ 

$\pi_{20} = \dfrac{1}{2!} \sum_{i,j=1}^{n} (1 - p_i - p_j)^N - \dfrac{1}{2!} \sum_{i,j,k=1}^{n} (1 - p_i - p_j - p_k)^N + \cdots \qquad (i \ne j$, etc.$)$ 

The factorial moments⁵ of the distribution given by (4.3) are easily derived. 
The first factorial moment is given by $\sigma_1 = \pi_{10} + 2\pi_{20} + 3\pi_{30} + \cdots + r\pi_{r0} + \cdots$ 
and the summation of the proper terms in (4.3) yields 

(4.4) $\sigma_1 = \sum_{i=1}^{n} (1 - p_i)^N$ 

In general, the r-th factorial moment, given by $\sigma_r = \sum_{k \ge r} k(k - 1) \cdots (k - r + 1)\pi_{k0}$, is 

(4.5) $\sigma_r = \sum_{a,b,\cdots,r=1}^{n} (1 - p_a - p_b - \cdots - p_r)^N, \qquad (a \ne b$, etc.$)$. 
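
The probabilities in (4.3) and the first factorial moment (4.4) can be checked on a small case. The sketch below uses the standard inclusion-exclusion form $\pi_{r0} = \sum_{k \ge r} (-1)^{k-r}\binom{k}{r} S_k$, where $S_k$ is the sum of $(1 - p_{i_1} - \cdots - p_{i_k})^N$ over unordered sets of k distinct events (so the sums over ordered indices in the text equal $k!\,S_k$); the probabilities chosen and the function names are hypothetical.

```python
from itertools import combinations, product
from math import comb

def prob_absent(ps, N, r):
    """P(exactly r of the events do not occur in N trials), by inclusion-exclusion."""
    n = len(ps)
    total = 0.0
    for k in range(r, n + 1):
        s_k = sum((1.0 - sum(sub)) ** N for sub in combinations(ps, k))
        total += (-1) ** (k - r) * comb(k, r) * s_k
    return total

def prob_absent_brute(ps, N, r):
    """Same probability by enumerating every sequence of N trial outcomes."""
    n = len(ps)
    total = 0.0
    for seq in product(range(n), repeat=N):
        if n - len(set(seq)) == r:
            pr = 1.0
            for e in seq:
                pr *= ps[e]
            total += pr
    return total

ps, N = [0.5, 0.3, 0.2], 4          # hypothetical unequal probabilities
for r in range(len(ps)):
    print(r, prob_absent(ps, N, r), prob_absent_brute(ps, N, r))
first_moment = sum(r * prob_absent(ps, N, r) for r in range(len(ps) + 1))
print(first_moment, sum((1.0 - p) ** N for p in ps))
```

The last line compares $\sum r\,\pi_{r0}$ with $\sum (1 - p_i)^N$, the value stated in (4.4).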

Indeed, (4.3) illustrates the fact that, if f(x) is the probability that a discontinuous variate takes the value x, then⁶ 

(4.6) $f(x) = \dfrac{1}{x!} \sum_{k=0}^{\infty} (-1)^k \sigma_{x+k}/k!$ 

⁵ J. F. Steffensen, Interpolation (1927), p. 101. 

⁶ J. F. Steffensen, “Factorial Moments and Discontinuous Frequency Functions,” 
Skandinavisk Aktuarietidskrift, Vol. VI (1923), pp. 73-89. 
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The moments about any constant of the distribution given by (4.3) may be 
derived from the factorial moments by the relation^ 

(4.7) E(^x — a)*^ = (1 + (tiA H-.a’2AV2! +•••-!- /tV) ({ s= — a) ^ 

where A is the difference operator of the calculus of finite differences, and { 
is replaced by (—a) after the indicated operations have been performed. 
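
As an illustration of relation (4.7), the following minimal sketch evaluates a moment about a constant from factorial moments by applying forward differences to $\xi^r$ at $\xi = -a$. The function names and the small check distribution (number of heads in three fair tosses) are assumptions introduced only for this example; they do not come from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def forward_difference(r, k, xi):
    """k-th forward difference of t**r, evaluated at t = xi."""
    return sum((-1) ** (k - j) * comb(k, j) * (xi + j) ** r for j in range(k + 1))


def moment_about(a, r, factorial_moments):
    """E(x - a)^r from sigma_0 = 1, sigma_1, ..., sigma_r, following (4.7)."""
    return sum(
        Fraction(factorial_moments[k]) * forward_difference(r, k, -a) / factorial(k)
        for k in range(r + 1)
    )


def falling(x, k):
    """x(x-1)...(x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out


# Hypothetical check distribution: heads in three fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
sigma = [sum(p * falling(x, k) for x, p in pmf.items()) for k in range(4)]

mean = sigma[1]
variance_from_47 = moment_about(mean, 2, sigma)
variance_direct = sum(p * (x - mean) ** 2 for x, p in pmf.items())
```

Exact rational arithmetic is used so that the operator form and the direct sum can be compared without rounding.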


Of special interest is the case when $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$, for which (4.3) becomes

$$\pi_{r0} = \binom{n}{r}\left(\frac{1}{n}\right)^{N} f_0(n - r, N) = \binom{n}{r}\,\frac{\Delta^{n-r}0^N}{n^N}, \tag{4.8}$$

where $f_0(n, N)$ and $\Delta^n 0^N$ are as defined in section 2. The probabilities in (4.8) are the respective terms of the expansion of $\left(\frac{1}{n}\right)^N (1 + \Delta)^n 0^N$.

For this case the $r$-th factorial moment becomes

$$\sigma_r = n(n-1)\cdots(n-r+1)\,(n-r)^N/n^N. \tag{4.9}$$

There is presented an example of the distribution (4.8) for the case $n = N = 10$.


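It is found that⁸

$$
\begin{aligned}
\Delta 0^{10} &= 1 & \Delta^{6}0^{10} &= 16435440\\
\Delta^{2}0^{10} &= 1022 & \Delta^{7}0^{10} &= 29635200\\
\Delta^{3}0^{10} &= 55980 & \Delta^{8}0^{10} &= 30240000\\
\Delta^{4}0^{10} &= 818520 & \Delta^{9}0^{10} &= 16329600\\
\Delta^{5}0^{10} &= 5103000 & \Delta^{10}0^{10} &= 3628800
\end{aligned}
\tag{4.10}
$$

$$
\begin{aligned}
\pi_{00} &= .000362880 & \pi_{50} &= .128595600\\
\pi_{10} &= .016329600 & \pi_{60} &= .017188920\\
\pi_{20} &= .136080000 & \pi_{70} &= .000671760\\
\pi_{30} &= .355622400 & \pi_{80} &= .000004599\\
\pi_{40} &= .345144240 & \pi_{90} &= .000000001
\end{aligned}
\tag{4.11}
$$

$$
\begin{aligned}
\sigma_1 &= 3.486784401 & m &= 3.486784401\\
\sigma_2 &= 9.663676416 & \sigma^2 &= 0.992795358
\end{aligned}
\tag{4.12}
$$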

⁷ This result is derived as follows: $(x - a)^r = (1 + \Delta)^x \cdot (-a)^r$; $E(x - a)^r = \sum (x - a)^r f(x) = \left(\sum (1 + \Delta)^x f(x)\right)(-a)^r = \left(\sum \left(1 + x\Delta + x(x-1)\Delta^2/2! + \cdots\right) f(x)\right)(-a)^r$. For a bivariate distribution it may be shown similarly that, symbolically, $E\{(x - a)^r (y - b)^s\} = \{\exp(\sigma_1\Delta_1 + \sigma_2\Delta_2)\} \cdot (-a)^r(-b)^s$, where $\sigma_1^m\sigma_2^n \equiv \sigma_{mn}$ and $\Delta_1$ operates only on $a$ and $\Delta_2$ operates only on $b$. A similar result may be derived for a multivariate distribution.

⁸ Cf. Whittaker & Robinson, op. cit., p. 7.

The observed distribution was obtained by distributing 200 sets of ten digits each, the digits being found in Tippett's Random Sampling Numbers.⁹ The results obtained are given in Fig. 2. Three of the 200 observed sets were illustrated in section 3.

The agreement between observed results and theoretical values is gratifying.


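5. Distribution of the number of events which occur once each. Let $\pi_{k1}$ represent the probability that there are $k$ events which occur once each. Thus, the various probabilities, obtained by rearranging the terms of the expansion of $(p_1 + p_2 + \cdots + p_n)^N$, are as follows:

$$
\begin{aligned}
\pi_{01} &= \sum \frac{N!}{x_1! \cdots x_n!}\, p_1^{x_1} \cdots p_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = 1;\\
\pi_{11} &= p_1 \sum \frac{N!}{x_2! \cdots x_n!}\, p_2^{x_2} \cdots p_n^{x_n} + \cdots + p_n \sum \frac{N!}{x_1! \cdots x_{n-1}!}\, p_1^{x_1} \cdots p_{n-1}^{x_{n-1}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-1} = N - 1,\ \text{etc., no } x = 1;\\
\pi_{k1} &= p_1 p_2 \cdots p_k \sum \frac{N!}{x_{k+1}! \cdots x_n!}\, p_{k+1}^{x_{k+1}} \cdots p_n^{x_n} + \cdots + p_{n-k+1} \cdots p_n \sum \frac{N!}{x_1! \cdots x_{n-k}!}\, p_1^{x_1} \cdots p_{n-k}^{x_{n-k}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-k} = N - k,\ \text{etc., no } x = 1;
\end{aligned}
\tag{5.1}
$$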

No. of events
not occurring    Observed      Theoretical
      x          frequency f   frequency       xf     x(x-1)f

      0               0            0.08          0         0
      1               8            3.26          8         0
      2              22           27.22         44        44
      3              72           71.12        216       432
      4              72           69.02        288       864
      5              21           25.72        105       420
      6               4            3.44         24       120
      7               1            0.14          7        42
      8               0            0.00          0         0
      9               0            0.00          0         0

    Total           200          200.00        692      1922

Observed parameters:     σ₁ = 3.46,  σ₂ = 9.61,  x̄ = 3.46,  s² = 1.0984
Theoretical parameters:  σ₁ = 3.49,  σ₂ = 9.66,  m = 3.49,  σ² = 0.99

Fig. 2
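
The observed parameters listed with Fig. 2 follow from the column totals of the table. As a small sketch (variable names are illustrative only), the computation is:

```python
# Observed frequencies from Fig. 2, keyed by x = number of events not occurring.
observed = {0: 0, 1: 8, 2: 22, 3: 72, 4: 72, 5: 21, 6: 4, 7: 1, 8: 0, 9: 0}

total = sum(observed.values())
sum_xf = sum(x * f for x, f in observed.items())
sum_xx1f = sum(x * (x - 1) * f for x, f in observed.items())

sigma1 = sum_xf / total      # observed first factorial moment (the mean)
sigma2 = sum_xx1f / total    # observed second factorial moment
variance = sigma2 + sigma1 - sigma1 ** 2
```

Note that the variance obtained this way uses the divisor 200, which is what reproduces the tabulated $s^2 = 1.0984$.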


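⁹ L. H. C. Tippett, Random Sampling Numbers, Tracts for Computers, No. XV (1927), London.

In view of (2.21) and (2.27), it is found that (5.1) becomes

$$
\begin{aligned}
\pi_{01} &= 1 - N\sum_{i=1}^{n} p_i(1 - p_i)^{N-1} + \frac{N(N-1)}{2!}\sum_{i,j=1}^{n} p_ip_j(1 - p_i - p_j)^{N-2} - \cdots\\
\pi_{11} &= N\Big\{\sum_{i=1}^{n} p_i(1 - p_i)^{N-1} - (N-1)\sum_{i,j=1}^{n} p_ip_j(1 - p_i - p_j)^{N-2} + \cdots\Big\}\\
\pi_{21} &= \frac{N(N-1)}{2!}\Big\{\sum_{i,j=1}^{n} p_ip_j(1 - p_i - p_j)^{N-2} - \cdots\Big\}\\
&\qquad (i \ne j, \text{ etc.})
\end{aligned}
\tag{5.2}
$$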


From (5.2) there is readily derived the fact that

$$\sigma_r = N(N-1)\cdots(N-r+1)\sum_{a,b,\ldots,r=1}^{n} p_ap_b \cdots p_r\,(1 - p_a - p_b - \cdots - p_r)^{N-r}, \qquad (a \ne b, \text{ etc.}) \tag{5.3}$$


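For the case in which $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$ the distribution in (5.2) becomes

$$
\begin{aligned}
\pi_{01} &= \left(\frac{1}{n}\right)^{N} f_1(n, N)\\
\pi_{11} &= \left(\frac{1}{n}\right)^{N} nN\, f_1(n-1, N-1)\\
\pi_{21} &= \left(\frac{1}{n}\right)^{N} \frac{n(n-1)N(N-1)}{2!}\, f_1(n-2, N-2)\\
\pi_{r1} &= \left(\frac{1}{n}\right)^{N} \frac{n^{(r)}N^{(r)}}{r!}\, f_1(n-r, N-r)
\end{aligned}
\tag{5.4}
$$

where $f_1(n, N)$ and $N^{(r)}$ have been defined in section 2. For this case (5.3) becomes

$$\sigma_r = n^{(r)}N^{(r)}(n - r)^{N-r}/n^N \tag{5.5}$$
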

Evaluation of (5.4) and (5.5) for $n = N = 10$ yields

$$
\begin{aligned}
\pi_{01} &= .00811639 & \pi_{41} &= .27052704 & \pi_{81} &= .01632960\\
\pi_{11} &= .04794633 & \pi_{51} &= .15621984 & \pi_{91} &= .00000000^{10}\\
\pi_{21} &= .14082336 & \pi_{61} &= .12700800 & \pi_{10,1} &= .00036288\\
\pi_{31} &= .21089376 & \pi_{71} &= .02177280
\end{aligned}
\tag{5.6}
$$

$$
\begin{aligned}
\sigma_1 &= 3.87420489 & m &= 3.87420489\\
\sigma_2 &= 13.58954496 & \sigma^2 &= 2.45428632
\end{aligned}
\tag{5.7}
$$

¹⁰ For the case $n = N = 10$ there cannot be 9 events occurring once each, since then the tenth event must also occur once.
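
The values in (5.6) can be checked without the function $f_1$ of section 2 by summing the equal-probability form of (5.2) directly. The sketch below does this by inclusion and exclusion; the helper names (`t_sum`, `prob_k_once`) are assumptions made for the example and are not the paper's notation.

```python
from fractions import Fraction
from math import comb


def falling(x, k):
    """x(x-1)...(x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out


def t_sum(j, n, N):
    """Sum, over all sets of j events, of P(each event in the set occurs exactly once)."""
    if j > N:
        return Fraction(0)
    return Fraction(comb(n, j) * falling(N, j) * (n - j) ** (N - j), n ** N)


def prob_k_once(k, n, N):
    """pi_{k1}: probability that exactly k events occur once each (equal p_i)."""
    return sum((-1) ** (j - k) * comb(j, k) * t_sum(j, n, N) for j in range(k, n + 1))


n = N = 10
pi = [prob_k_once(k, n, N) for k in range(n + 1)]
mean = sum(k * p for k, p in enumerate(pi))
```

The zero obtained for nine events occurring once each is the cancellation described in the footnote above.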



SOLOMON KULLBACK

The observed distribution, given in Fig. 3, was obtained from the 200 sets 
previously considered. 

The agreement between the observed results and theoretical values is 
gratifying. 

6. Distribution of the number of events which occur r times each. Let $\pi_{kr}$ represent the probability that there are $k$ events occurring $r$ times each. Thus, the various probabilities, obtained by rearranging the terms of the expansion of $(p_1 + p_2 + \cdots + p_n)^N$, are as follows:


No. of events
occurring
once each        Observed      Theoretical
      x          frequency f   frequency       xf     x(x-1)f

      0               1            1.62          0         0
      1              10            9.58         10         0
      2              30           28.16         60        60
      3              37           42.18        111       222
      4              62           54.10        248       744
      5              27           31.24        135       540
      6              22           25.40        132       660
      7               3            4.36         21       126
      8               8            3.26         64       448
      9               0            0.00          0         0
     10               0            0.08          0         0

    Total           200          199.98        781      2800

Observed parameters:     σ₁ = 3.905,  σ₂ = 14.000,  x̄ = 3.905,  s² = 2.656
Theoretical parameters:  σ₁ = 3.874,  σ₂ = 13.590,  m = 3.874,  σ² = 2.454

Fig. 3


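$$
\begin{aligned}
\pi_{0r} &= \sum \frac{N!}{x_1! \cdots x_n!}\, p_1^{x_1} \cdots p_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = r;\\
\pi_{1r} &= \frac{p_1^r}{r!} \sum \frac{N!}{x_2! \cdots x_n!}\, p_2^{x_2} \cdots p_n^{x_n} + \cdots + \frac{p_n^r}{r!} \sum \frac{N!}{x_1! \cdots x_{n-1}!}\, p_1^{x_1} \cdots p_{n-1}^{x_{n-1}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-1} = N - r,\ \text{etc., no } x = r;\\
\pi_{kr} &= \frac{p_1^r p_2^r \cdots p_k^r}{(r!)^k} \sum \frac{N!}{x_{k+1}! \cdots x_n!}\, p_{k+1}^{x_{k+1}} \cdots p_n^{x_n} + \cdots + \frac{p_{n-k+1}^r \cdots p_n^r}{(r!)^k} \sum \frac{N!}{x_1! \cdots x_{n-k}!}\, p_1^{x_1} \cdots p_{n-k}^{x_{n-k}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-k} = N - kr,\ \text{etc., no } x = r;
\end{aligned}
\tag{6.1}
$$

In view of (2.21) and (2.27) it is found that (6.1) becomes

$$
\begin{aligned}
\pi_{0r} &= 1 - \frac{N^{(r)}}{r!}\sum_{i=1}^{n} p_i^r(1 - p_i)^{N-r} + \frac{N^{(2r)}}{2!\,(r!)^2}\sum_{i,j=1}^{n} p_i^rp_j^r(1 - p_i - p_j)^{N-2r} - \cdots\\
\pi_{1r} &= \frac{N^{(r)}}{r!}\sum_{i=1}^{n} p_i^r(1 - p_i)^{N-r} - \frac{N^{(2r)}}{(r!)^2}\sum_{i,j=1}^{n} p_i^rp_j^r(1 - p_i - p_j)^{N-2r} + \cdots\\
\pi_{2r} &= \frac{N^{(2r)}}{2!\,(r!)^2}\Big\{\sum_{i,j=1}^{n} p_i^rp_j^r(1 - p_i - p_j)^{N-2r} - \cdots\Big\}\\
&\qquad (i \ne j, \text{ etc.})
\end{aligned}
\tag{6.2}
$$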


From (6.2) there is readily derived the fact that

$$\sigma_k = \frac{N^{(kr)}}{(r!)^k}\sum_{a,b,\ldots,k=1}^{n} p_a^rp_b^r \cdots p_k^r\,(1 - p_a - p_b - \cdots - p_k)^{N-kr}, \qquad (a \ne b, \text{ etc.}) \tag{6.3}$$

For $r = 0, 1$, (6.2) and (6.3) reduce to the values previously derived.


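For the case in which $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$ the distribution in (6.2) becomes

$$
\begin{aligned}
\pi_{0r} &= \left(\frac{1}{n}\right)^{N} f_r(n, N)\\
\pi_{1r} &= \left(\frac{1}{n}\right)^{N} \frac{nN^{(r)}}{r!}\, f_r(n-1, N-r)\\
\pi_{kr} &= \left(\frac{1}{n}\right)^{N} \frac{n^{(k)}N^{(kr)}}{k!\,(r!)^k}\, f_r(n-k, N-kr)
\end{aligned}
\tag{6.4}
$$

where $f_r(n, N)$ has been defined in section 2. For this case (6.3) becomes

$$\sigma_k = N^{(kr)}n^{(k)}(n - k)^{N-kr}/(r!)^k n^N \tag{6.5}$$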

7. Simultaneous distribution of the number of events not occurring, and of the number of events occurring once each. The probabilities for the simultaneous occurrence of the various combinations of the number of events not occurring, and of the number of events occurring once each, are given by rearranging the terms of the expansion of $(p_1 + p_2 + \cdots + p_n)^N$, and are given as in Fig. 4.

In Fig. 4 none of the subscripts take on equal values simultaneously, and $G_{01}$ has been defined in section 2. Summation of the values in the $k$-th column of Fig. 4 yields the probability that there are $(k - 1)$ events not occurring. Comparison with (4.2) yields

$$
\begin{aligned}
F_0(n, N, p_1, p_2, \ldots, p_n) = G_0(n, N) = G_{01}(n, N) &+ N\sum_{i=1}^{n} p_iG_{01}(n-1, N-1, p_i)\\
&+ \frac{N^{(2)}}{2!}\sum_{i,j=1}^{n} p_ip_jG_{01}(n-2, N-2, p_i, p_j) + \cdots, \qquad (i \ne j, \text{ etc.})
\end{aligned}
\tag{7.1}
$$






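Columns: number of events not occurring. Rows: number of events occurring once each.

$$
\begin{array}{c|cccc}
 & 0 & 1 & \cdots & r\\
\hline
0 & G_{01}(n, N) & \sum_{i} G_{01}(n-1, N, p_i) & \cdots & \\
1 & N\sum_{i} p_iG_{01}(n-1, N-1, p_i) & N\sum_{i,j} p_iG_{01}(n-2, N-1, p_i, p_j) & \cdots & \\
2 & \frac{N^{(2)}}{2!}\sum_{i,j} p_ip_jG_{01}(n-2, N-2, p_i, p_j) & \frac{N^{(2)}}{2!}\sum_{i,j,k} p_ip_jG_{01}(n-3, N-2, p_i, p_j, p_k) & \cdots & \\
\vdots & & & & \\
s & & & & \frac{N^{(s)}}{r!\,s!}\sum_{a,b,\ldots,\rho} p_ap_b \cdots p_s\,G_{01}(n-r-s, N-s, p_a, \ldots, p_s, p_\alpha, \ldots, p_\rho)
\end{array}
$$

Fig. 4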


Summation of the values in the $k$-th row of Fig. 4 yields the probability that there are $(k - 1)$ events occurring once each. Comparison with (5.2) and (2.27) yields

$$
\begin{aligned}
F_1(n, N, p_1, p_2, \ldots, p_n) = G_1(n, N) = G_{01}(n, N) &+ \sum_{i=1}^{n} G_{01}(n-1, N, p_i)\\
&+ \frac{1}{2!}\sum_{i,j=1}^{n} G_{01}(n-2, N, p_i, p_j) + \cdots, \qquad (i \ne j, \text{ etc.})
\end{aligned}
\tag{7.2}
$$

If we use $x$ to represent the number of events not occurring, and $y$ the number of events occurring once each, then it is found that

$$E(x^{(r)}y^{(s)}) = \sigma_{rs} = N^{(s)}\sum_{a,b,\ldots,\rho=1}^{n} p_ap_b \cdots p_s\,(1 - p_a - \cdots - p_s - p_\alpha - \cdots - p_\rho)^{N-s}, \qquad (a \ne b, \text{ etc.}). \tag{7.3}$$

If ${}_0\bar{x}_{k1}$ represents the average number of events not occurring, when there are $k$ events occurring once each, then from Fig. 4 there is found that

$$
{}_0\bar{x}_{01} = \frac{\sum_{i=1}^{n} G_{01}(n-1, N, p_i) + 2\sum_{i,j=1}^{n} G_{01}(n-2, N, p_i, p_j)/2! + 3\sum_{i,j,k=1}^{n} G_{01}(n-3, N, p_i, p_j, p_k)/3! + \cdots}{G_{01}(n, N) + \sum_{i=1}^{n} G_{01}(n-1, N, p_i) + \sum_{i,j=1}^{n} G_{01}(n-2, N, p_i, p_j)/2! + \cdots}, \qquad (i \ne j, \text{ etc.})
\tag{7.4}
$$






DISTRIBUTIONS DERIVED FROM MULTINOMIAL DISTRIBUTION


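In view of (7.2), (7.4) reduces to

$${}_0\bar{x}_{01} = \Big(\sum_{a=1}^{n} G_1(n-1, N, p_a)\Big) \Big/ G_1(n, N) \tag{7.5}$$

A similar procedure yields, in general,

$$
{}_0\bar{x}_{k1} = \frac{\sum_{a,b,\ldots,k,i=1}^{n} p_ap_b \cdots p_k\,G_1(n-k-1, N-k, p_a, p_b, \ldots, p_k, p_i)}{\sum_{a,b,\ldots,k=1}^{n} p_ap_b \cdots p_k\,G_1(n-k, N-k, p_a, p_b, \ldots, p_k)}, \qquad (a \ne b, \text{ etc.})
\tag{7.6}
$$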

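If ${}_1\bar{y}_{k0}$ represents the average number of events occurring once each, when there are $k$ events not occurring, then from Fig. 4 there is found that

$$
{}_1\bar{y}_{00} = \frac{N\sum_{i=1}^{n} p_iG_{01}(n-1, N-1, p_i) + 2N(N-1)\sum_{i,j=1}^{n} p_ip_jG_{01}(n-2, N-2, p_i, p_j)/2! + \cdots}{G_{01}(n, N) + N\sum_{i=1}^{n} p_iG_{01}(n-1, N-1, p_i) + N(N-1)\sum_{i,j=1}^{n} p_ip_jG_{01}(n-2, N-2, p_i, p_j)/2! + \cdots}, \qquad (i \ne j, \text{ etc.})
\tag{7.7}
$$

In view of (7.1), (7.7) reduces to

$${}_1\bar{y}_{00} = \Big(N\sum_{i=1}^{n} p_iG_0(n-1, N-1, p_i)\Big) \Big/ G_0(n, N) \tag{7.8}$$

A similar procedure yields, in general,

$$
{}_1\bar{y}_{k0} = \frac{N\sum_{a,b,\ldots,k,i=1}^{n} p_i\,G_0(n-k-1, N-1, p_a, p_b, \ldots, p_k, p_i)}{\sum_{a,b,\ldots,k=1}^{n} G_0(n-k, N, p_a, p_b, \ldots, p_k)}, \qquad (a \ne b, \text{ etc.})
\tag{7.9}
$$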

For the case in which $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$, as may be found from Fig. 4, the probability for the simultaneous occurrence of $r$ events not occurring, and $s$ events occurring once each, is given by

$$\left(\frac{1}{n}\right)^{N} \frac{n^{(r+s)}N^{(s)}}{r!\,s!}\, f_{01}(n-r-s, N-s) \tag{7.10}$$

For this case (7.1), (7.2), (7.3), (7.6), and (7.9) yield respectively

$$f_0(n, N) = f_{01}(n, N) + nN f_{01}(n-1, N-1) + \binom{n}{2}N^{(2)} f_{01}(n-2, N-2) + \cdots \tag{7.11}$$

$$f_1(n, N) = f_{01}(n, N) + n f_{01}(n-1, N) + \binom{n}{2} f_{01}(n-2, N) + \cdots \tag{7.12}$$

$$\sigma_{rs} = n^{(r+s)}N^{(s)}(n - r - s)^{N-s}/n^N \tag{7.13}$$


$${}_0\bar{x}_{k1} = (n - k)\,f_1(n-k-1, N-k)/f_1(n-k, N-k) \tag{7.14}$$

$${}_1\bar{y}_{k0} = N(n - k)\,f_0(n-k-1, N-1)/f_0(n-k, N) \tag{7.15}$$

Let us consider again the case when $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$ and $n = N = 10$. Evaluating (7.14) and (7.15) by means of (2.15) yields

(7.16)

(7.17)

The 200 sets of observations already considered yielded the simultaneous distribution given in Fig. 5.

The distribution in Fig. 5 yields $\sigma_{11} = 11.89$; (7.13) yields $\sigma_{11} = 12.07959552$.

The agreement between the observed results in Fig. 5 and the theoretical values in (7.16) and (7.17) is gratifying.
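
The theoretical joint factorial moment quoted above can be evaluated from (7.13) in a few lines. This is only a sketch of that one formula for the equal-probability case; the function name `sigma_rs` is an assumption chosen for the example.

```python
from fractions import Fraction


def falling(x, k):
    """x(x-1)...(x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out


def sigma_rs(r, s, n, N):
    """Joint factorial moment E(x^(r) y^(s)) of (7.13), with all p_i = 1/n."""
    return Fraction(falling(n, r + s) * falling(N, s) * (n - r - s) ** (N - s), n ** N)


sigma_11 = sigma_rs(1, 1, 10, 10)
sigma_10 = sigma_rs(1, 0, 10, 10)   # reduces to sigma_1 of section 4
sigma_01 = sigma_rs(0, 1, 10, 10)   # reduces to sigma_1 of section 5
covariance = sigma_11 - sigma_10 * sigma_01
```

Setting $s = 0$ or $r = 0$ recovers the single-variable factorial moments of sections 4 and 5, which gives a simple consistency check on the joint formula.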

8. Simultaneous distribution of the number of events which occur r times each, and of the number of events which occur s times each. The probabilities for the simultaneous occurrence of the various combinations of the number of events which occur $r$ times each, and of the number of events which occur $s$ times each, are obtained by rearranging the terms of the expansion of $(p_1 + p_2 + \cdots + p_n)^N$. If $\pi_{kr,ls}$ is the probability for the simultaneous occurrence of $k$ events which occur $r$ times each and $l$ events which occur $s$ times each, then

$$
\pi_{kr,ls} = \frac{N^{(kr+ls)}}{k!\,l!\,(r!)^k(s!)^l}\sum_{a,\ldots,k,\alpha,\ldots,\lambda=1}^{n} p_a^r \cdots p_k^r\,p_\alpha^s \cdots p_\lambda^s\,G_{rs}(n-k-l, N-kr-ls, p_a, \ldots, p_k, p_\alpha, \ldots, p_\lambda), \qquad (a \ne b, \text{ etc.})
\tag{8.1}
$$

where $G_{rs}$ is defined in section 2.

From (8.1) and (6.2), there is derived, in a manner similar to the derivation of (7.1) and (7.2), the result that

$$
\begin{aligned}
F_r(n, N, p_1, \ldots, p_n) = G_r(n, N) = G_{rs}(n, N) &+ \frac{N^{(s)}}{s!}\sum_{i=1}^{n} p_i^s\,G_{rs}(n-1, N-s, p_i)\\
&+ \frac{N^{(2s)}}{2!\,(s!)^2}\sum_{i,j=1}^{n} p_i^sp_j^s\,G_{rs}(n-2, N-2s, p_i, p_j) + \cdots, \qquad (i \ne j, \text{ etc.})
\end{aligned}
\tag{8.2}
$$

and a similar result by interchanging $r$ and $s$ in (8.2).
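For the distribution given by (8.1), it is found that

$$
\sigma_{kl} = \frac{N^{(kr+ls)}}{(r!)^k(s!)^l}\sum_{a,b,\ldots,k,\alpha,\ldots,\lambda=1}^{n} p_a^r \cdots p_k^r\,p_\alpha^s \cdots p_\lambda^s\,(1 - p_a - \cdots - p_k - p_\alpha - \cdots - p_\lambda)^{N-kr-ls}, \qquad (a \ne b, \text{ etc.})
\tag{8.3}
$$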


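If ${}_r\bar{x}_{ls}$ represents the average number of events which occur $r$ times each when there are $l$ events which occur $s$ times each, then from (8.1) and (8.2), in a manner similar to the derivation of (7.6), it is found that

$$
{}_r\bar{x}_{ls} = \frac{(N - ls)^{(r)}\sum_{\alpha,\ldots,\lambda,a=1}^{n} p_a^r\,p_\alpha^s \cdots p_\lambda^s\,G_s(n-l-1, N-r-ls, p_a, p_\alpha, \ldots, p_\lambda)}{r!\sum_{\alpha,\ldots,\lambda=1}^{n} p_\alpha^s \cdots p_\lambda^s\,G_s(n-l, N-ls, p_\alpha, \ldots, p_\lambda)}, \qquad (\alpha \ne \beta, \text{ etc.})
\tag{8.4}
$$

If ${}_s\bar{y}_{kr}$ represents the average number of events which occur $s$ times each when there are $k$ events which occur $r$ times each, then by interchanging $k$ and $l$, and $r$ and $s$ in (8.4), there is found

$$
{}_s\bar{y}_{kr} = \frac{(N - kr)^{(s)}\sum_{a,\ldots,k,\alpha=1}^{n} p_a^r \cdots p_k^r\,p_\alpha^s\,G_r(n-k-1, N-kr-s, p_a, \ldots, p_k, p_\alpha)}{s!\sum_{a,\ldots,k=1}^{n} p_a^r \cdots p_k^r\,G_r(n-k, N-kr, p_a, \ldots, p_k)}, \qquad (a \ne b, \text{ etc.})
\tag{8.5}
$$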


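For the case when $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$ it is found that (8.1), (8.2), (8.3), (8.4), and (8.5) respectively yield

$$\pi_{kr,ls} = \left(\frac{1}{n}\right)^{N} \frac{n^{(k+l)}N^{(kr+ls)}}{k!\,l!\,(r!)^k(s!)^l}\, f_{rs}(n-k-l, N-kr-ls) \tag{8.6}$$

$$f_r(n, N) = f_{rs}(n, N) + \frac{nN^{(s)}}{s!}\, f_{rs}(n-1, N-s) + \frac{n^{(2)}N^{(2s)}}{2!\,(s!)^2}\, f_{rs}(n-2, N-2s) + \cdots \tag{8.7}$$

$$\sigma_{kl} = N^{(kr+ls)}n^{(k+l)}(n-k-l)^{N-kr-ls}/(r!)^k(s!)^l\,n^N \tag{8.8}$$

$${}_r\bar{x}_{ls} = (n - l)(N - ls)^{(r)}f_s(n-l-1, N-r-ls)/r!\,f_s(n-l, N-ls) \tag{8.9}$$

$${}_s\bar{y}_{kr} = (n - k)(N - kr)^{(s)}f_r(n-k-1, N-kr-s)/s!\,f_r(n-k, N-kr) \tag{8.10}$$

For $r = 0$, $s = 1$, the results derived in this section of course reduce to those already derived in section 7.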


9. Conclusion. It is clear that the same method of procedure may be employed to study the simultaneous distribution of the number of events which occur $r, s, \ldots, t$ times each. However we will not continue the discussion any further.

We have thus seen that the multinomial distribution serves as the background for the study of a number of distributions which have certain practical applications.

The theory discussed herein has been illustrated by several examples which yielded gratifying agreement between observed and theoretical results.


Washington, D. C. 



A PROBLEM IN LEAST SQUARES 
By Jan K. Wiśniewski

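§1. We are dealing with two variables, the observed values of which are denoted $x$ and $y$ respectively. The pairs of observations are divided into $r$ groups, numbering $n_1, n_2, \ldots, n_r$ pairs. Suppose in each group we determine a regression equation of the following shape:

$$y_i = a_i + b_i x + \cdots + m_i x^s \tag{1}$$

where $y_i$ denotes the value of the "dependent" variable obtained from the regression equation, while $y$ without any subscript denotes its observed value. The $r$ regression equations of type (1) are not assumed independent; on the contrary, we postulate that

$$\sum_{1}^{r} y_i = a_0 + b_0 x + \cdots + m_0 x^s \tag{2}$$

be fulfilled identically in $x$; $a_0, b_0, \ldots, m_0$ being predetermined numbers. This leads to the following conditions:

$$\sum_{1}^{r} a_i = a_0, \qquad \sum_{1}^{r} b_i = b_0, \qquad \ldots, \qquad \sum_{1}^{r} m_i = m_0. \tag{3}$$

The magnitude to be minimized under the theory of least squares is now

$$
z = \sum_{i=1}^{r-1}\sum\nolimits_i \big[y - (a_i + b_i x + \cdots + m_i x^s)\big]^2 + \sum\nolimits_r \Big[y - \Big(a_0 - \sum_{1}^{r-1} a_i\Big) - \Big(b_0 - \sum_{1}^{r-1} b_i\Big)x - \cdots - \Big(m_0 - \sum_{1}^{r-1} m_i\Big)x^s\Big]^2
\tag{4}
$$

The normal equations derived from (4) are of the following shape:

$$
\begin{aligned}
&n_j a_j + n_r\sum_{1}^{r-1} a_i + b_j\sum\nolimits_j x + \Big(\sum_{1}^{r-1} b_i\Big)\Big(\sum\nolimits_r x\Big) + \cdots + m_j\sum\nolimits_j x^s + \Big(\sum_{1}^{r-1} m_i\Big)\Big(\sum\nolimits_r x^s\Big)\\
&\qquad = \sum\nolimits_j y - \sum\nolimits_r y + n_r a_0 + b_0\sum\nolimits_r x + \cdots + m_0\sum\nolimits_r x^s\\[4pt]
&a_j\sum\nolimits_j x + \Big(\sum_{1}^{r-1} a_i\Big)\Big(\sum\nolimits_r x\Big) + b_j\sum\nolimits_j x^2 + \Big(\sum_{1}^{r-1} b_i\Big)\Big(\sum\nolimits_r x^2\Big) + \cdots + m_j\sum\nolimits_j x^{s+1} + \Big(\sum_{1}^{r-1} m_i\Big)\Big(\sum\nolimits_r x^{s+1}\Big)\\
&\qquad = \sum\nolimits_j xy - \sum\nolimits_r xy + a_0\sum\nolimits_r x + b_0\sum\nolimits_r x^2 + \cdots + m_0\sum\nolimits_r x^{s+1}\\[4pt]
&\qquad\cdots\\[4pt]
&a_j\sum\nolimits_j x^s + \Big(\sum_{1}^{r-1} a_i\Big)\Big(\sum\nolimits_r x^s\Big) + b_j\sum\nolimits_j x^{s+1} + \Big(\sum_{1}^{r-1} b_i\Big)\Big(\sum\nolimits_r x^{s+1}\Big) + \cdots + m_j\sum\nolimits_j x^{2s} + \Big(\sum_{1}^{r-1} m_i\Big)\Big(\sum\nolimits_r x^{2s}\Big)\\
&\qquad = \sum\nolimits_j x^sy - \sum\nolimits_r x^sy + a_0\sum\nolimits_r x^s + b_0\sum\nolimits_r x^{s+1} + \cdots + m_0\sum\nolimits_r x^{2s}
\end{aligned}
\tag{5}
$$
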
$\sum_i$ meaning a summation extended over the $i$-th group. As (1) is of the $s$-th degree, we have $(s + 1)(r - 1)$ parameters to determine and as many equations, the problem thus being in theory solved.¹ As to the numerical solution, Doolittle's method or any other may be applied. We do not enter at present into the question of how much labor the actual solution would require.

Examples. Allen and Bowley in their book on "Family Expenditure" (London, 1935) assume the expenditure on some defined item $f$ to be a linear function of the total expenditure $e$

$$f = ke + c. \tag{6}$$

Evidently $\sum k = 1$, $\sum c = 0$ (cf. pp. 10-11). Another example I give in a paper on seasonal variation, which appeared in "Economic Studies" III (Kraków). Actual values $y$ of a time series are assumed to be linear functions of certain "normal" values $x$

$$y = a + bx \tag{7}$$

$a$ and $b$ changing from month to month but constant from year to year. Then

$$\sum a = 0, \qquad \sum b = 12.$$

§2. Methods of solution in special cases. The generally recognized methods of solving normal equations become extremely laborious as the product $(s + 1)(r - 1)$ grows large. As a matter of fact, the amount of computer's work is approximately proportional to the cube of the number of parameters to determine. Therefore short cuts seem to be indispensable. A most elegant one is at our disposal in the special case² when the values of $x$ in the several groups are identical, or, at least, the sums $n_i, \sum_i x, \sum_i x^2, \ldots, \sum_i x^{2s}$ are identical in $i$.

¹ The remaining $s + 1$ parameters $a_r, b_r, \ldots, m_r$ are, of course, found from (3).

² This seems to be realized in Allen and Bowley's work.

Instead of (1) we shall write

$$y_i = A_i + B_iX_1 + \cdots + M_iX_s \tag{8}$$

where $X_1, X_2, \ldots$ are orthogonal polynomials, i.e. such that $\sum X_iX_j = 0$ if and only if $i \ne j$. In general $X_h = x^h + \alpha^h_{h-1}x^{h-1} + \cdots + \alpha^h_0$, the coefficients being rational functions of $n, \sum x, \sum x^2, \ldots$

The conditions (3) can now be replaced by a set of equivalent ones, viz.

$$\sum_{1}^{r} A_i = A_0, \qquad \sum_{1}^{r} B_i = B_0, \qquad \ldots, \qquad \sum_{1}^{r} M_i = M_0. \tag{9}$$


How the actual values of $A_0, B_0, \ldots, M_0$ are found will be shown in the next paragraph. The solution becomes now very easy, as the normal equations for the determination of each set of $r - 1$ parameters are independent, i.e. we can calculate the $A$'s separately, then the $B$'s, etc., the order of solution being of no importance. Moreover the shape of the normal equations permits of considerable simplification of solution. Suppose we have to determine the values of the coefficients $K$ corresponding to $X_h$. The normal equations are now, after certain simplifications,

$$
\begin{aligned}
2K_1 + K_2 + K_3 + \cdots + K_{r-1} &= \Big(\sum\nolimits_1 X_hy - \sum\nolimits_r X_hy\Big)\Big/\sum X_h^2 + K_0\\
K_1 + 2K_2 + K_3 + \cdots + K_{r-1} &= \Big(\sum\nolimits_2 X_hy - \sum\nolimits_r X_hy\Big)\Big/\sum X_h^2 + K_0\\
&\ \ \vdots\\
K_1 + K_2 + K_3 + \cdots + 2K_{r-1} &= \Big(\sum\nolimits_{r-1} X_hy - \sum\nolimits_r X_hy\Big)\Big/\sum X_h^2 + K_0.
\end{aligned}
\tag{10}
$$

Adding these equations, dividing the sum by $r$ and subtracting the quotient from the $j$-th equation, we get

$$
K_j = \frac{\sum_j X_hy}{\sum X_h^2} - \frac{1}{r}\left[\frac{\sum_{i=1}^{r}\sum_i X_hy}{\sum X_h^2} - K_0\right].
\tag{11}
$$


The first member of the right hand side of (11) should be regarded as the principal term: this is actually the value we would obtain for $K_j$, were this coefficient independent from the other $K$'s. The second member is a correction term, the necessary amount of correction being distributed equally among the several $K$'s. The simple solution given by (11) is only possible if the sum $\sum X_h^2$ is the same for each group. From the definition of $X_h$ we see that it is equivalent to saying that $n_i, \sum_i x, \sum_i x^2, \ldots, \sum_i x^{2h}$ be identical in $i$. As $h$ increases to $s$, we come to the condition given at the beginning of this paragraph.

§3. If this condition is not fulfilled, we can, indeed, replace the power series in $x$ by orthogonal polynomials $X_{h \cdot i}$, the second subscript being appended in order to show that the values of the $X$ polynomials are no more identical for the several groups; these polynomials are now orthogonalised separately within each group. But we are no more able to predetermine the values of $A_0, B_0, \ldots, M_0$, as they depend on each other; this will be made clear a little later. Therefore we have to resort to an approximation: the values of the parameters will not be found from simultaneous equations, but successively, step by step, beginning with those corresponding to the highest degree of the independent variable.

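The values of $a_0, b_0, \ldots, m_0$ are given. It is evident that $m_0 = M_0$. The $j$-th normal equation is now:

$$
M_j\sum\nolimits_j X_{s \cdot j}^2 - M_0\sum\nolimits_r X_{s \cdot r}^2 + \Big(\sum_{1}^{r-1} M_i\Big)\Big(\sum\nolimits_r X_{s \cdot r}^2\Big) = \sum\nolimits_j X_{s \cdot j}\,y - \sum\nolimits_r X_{s \cdot r}\,y.
\tag{12}
$$

We see at once that

$$
M_j = \frac{M_1\sum_1 X_{s \cdot 1}^2 + \sum_j X_{s \cdot j}\,y - \sum_1 X_{s \cdot 1}\,y}{\sum_j X_{s \cdot j}^2}.
\tag{13}
$$

Inserting this into (12) we get

$$
M_j = \frac{\sum_j X_{s \cdot j}\,y}{\sum_j X_{s \cdot j}^2} - \frac{\displaystyle\sum_{i=1}^{r}\frac{\sum_i X_{s \cdot i}\,y}{\sum_i X_{s \cdot i}^2} - M_0}{\displaystyle\sum\nolimits_j X_{s \cdot j}^2\ \sum_{i=1}^{r}\frac{1}{\sum_i X_{s \cdot i}^2}}.
\tag{14}
$$
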


The second member of the right hand side of (14) is again a correction term, the necessary amount of correction being distributed in inverse proportion to $\sum_j X_{s \cdot j}^2$. Now we determine the value of $L_0$, this coefficient corresponding to $s - 1$, the second highest degree of $x$, and calculate the several $L$'s from equations strictly analogous to (14), thus accomplishing the second step of our work, and so on, down to the $A$'s. $L_0$ is found from the following equation:

$$L_0 = l_0 - \sum_{1}^{r}\big[\alpha^s_{s-1}(i) \cdot M_i\big]. \tag{15}$$

To $\alpha^s_{s-1}$ is now appended a bracketed $i$, this to stress its variation from group to group. We see from (15) that before the several $M$'s are calculated we are not in a position to determine $L_0$. On the other hand, if $\alpha^s_{s-1}$ is the same for all groups, the second member of the right hand side of (15) simply reduces to $\alpha^s_{s-1} \cdot m_0$ and $L_0$ can be determined in advance, i.e. before calculating the $M$'s. This is the case treated first (in §2). In any case, if no definite correlation is to be expected between $\alpha^s_{s-1}(i)$ and $M_i$, the approximative method developed here should give very nearly correct results. The writer applied this method of solution to the simple problem of seasonal variation mentioned in §1 and found the results very satisfactory.
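
The correction step described in §3 can be sketched numerically as follows. The function below takes, for each of the $r$ groups, the sums $\sum X y$ and $\sum X^2$ for one orthogonal polynomial and returns coefficients whose total equals the prescribed value, distributing the correction in inverse proportion to $\sum X^2$; when all groups share the same $\sum X^2$ it reduces to the equal correction of §2. The function name and the numerical inputs are hypothetical and serve only as an illustration.

```python
def constrained_coefficients(sum_xy, sum_xx, total):
    """Coefficients K_j (one per group) whose sum over all groups equals `total`.

    sum_xy[j] is the sum of X*y over group j; sum_xx[j] is the sum of X**2
    over group j. The principal term is the unconstrained coefficient; the
    correction is spread in inverse proportion to sum_xx[j].
    """
    unconstrained = [t / s for t, s in zip(sum_xy, sum_xx)]
    excess = sum(unconstrained) - total
    weight = sum(1.0 / s for s in sum_xx)
    return [k - excess / (s * weight) for k, s in zip(unconstrained, sum_xx)]


# Hypothetical sums for three groups with unequal sums of squares.
sum_xy = [4.0, 9.0, 2.5]
sum_xx = [2.0, 3.0, 5.0]
coeffs = constrained_coefficients(sum_xy, sum_xx, total=1.0)

# With equal sums of squares every coefficient receives the same correction.
equal_case = constrained_coefficients([4.0, 9.0, 2.0], [2.0, 2.0, 2.0], total=0.0)
```

A useful check on any such solution is that $K_j\sum_j X^2 - \sum_j Xy$ takes the same value in every group, which is the stationarity condition of the constrained minimum.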



A SIGNIFICANCE TEST FOR COMPONENT ANALYSIS 
By Paul G. Hoel 
1. Introduction 

During the last few years several papers and books have been written on 
various aspects of what has been termed component or factor analysis. This 
analysis has arisen from the psychological problem of describing the results on a 
series of tests in terms of a few distinct abilities or components. In much of 
such work it is claimed that there does not exist more than a certain number 
of components, the material discarded in order to substantiate such a claim 
being considered as due to random errors of sampling or errors of measurement. 
However, mere inspection of results or the calculation of standard errors of 
residual correlations is hardly sufficient to justify such conclusions, and therefore
a significance test of some kind is necessary. Hotelling¹ considered such
a test but based it upon an uncertain analogy with the analysis of variance 
and upon the legitimacy of using standard errors. The purpose of this paper 
is to derive a test which is more general in scope and in which all assumptions 
are explicitly stated. 

If each test score is thought of as being made up of two parts, a true score 
and an error element, the assumption that there exist fewer components than
the number of tests implies that the scatter diagram of the true scores will lie 
in a space of correspondingly smaller dimensionality. Consequently, an ideal 
test for the number of components would be one which would test the rank 
of the true moment matrix. In the case of normally distributed variables, 
this line of approach leads one to the sampling distribution of the generalized 
variance. Unfortunately, this distribution appears in unintegrated form; how- 
ever, by considering its moments it is possible to find a good approximation 
to this exact distribution for samples which are not too small. 

The paper proceeds by first finding two approximation distributions for the 
generalized variance, one for samples which are not too small and one for large 
samples. It then considers the type of population from which it will be assumed 
the sample was drawn, and finally applies the test to two numerical examples 
from recent literature along such lines. 

2. Approximation Distributions 

Suppose that N individuals have been drawn at random from an n variate 
normal population whose distribution is expressed by 

$$P(x_1, x_2, \ldots, x_n) = K e^{-\frac{1}{2}\sum_{i,j=1}^{n} A_{ij}x_ix_j} \tag{1}$$

¹ Harold Hotelling, "Analysis of a Complex of Statistical Variables into Principal Components," The Journal of Educational Psychology, September and October, 1933, pp. 21-26.

where $x_i = X_i - m_i$, $A_{ij} = \dfrac{\Delta_{ij}}{\sigma_i\sigma_j\Delta}$, $\Delta$ is the determinant $|\rho_{ij}|$ and $\Delta_{ij}$ is the cofactor of $\rho_{ij}$ in $\Delta$, and $K = |A_{ij}|^{1/2}(2\pi)^{-n/2}$. If the observed values of the variables of the $\alpha$th individual are denoted by $X_{i\alpha}$ $(i = 1, 2, \ldots, n)$, then the generalized sample variance is defined as $z = |a_{ij}|$, where $a_{ij} = \frac{1}{N}\sum_{\alpha=1}^{N}(X_{i\alpha} - \bar{X}_i)(X_{j\alpha} - \bar{X}_j)$. Wilks² has shown that in sampling from the population (1), the $k$th moment of the sampling distribution of $z$ is given by

$$M_k = A^{-k}\prod_{i=1}^{n}\frac{\Gamma\left(\frac{N-i}{2} + k\right)}{\Gamma\left(\frac{N-i}{2}\right)}$$

where $A = \frac{N^n}{2^n}|A_{ij}|$. An inspection of the integrated form of the distribution of $z$ in the case of $n = 1$ and $n = 2$ suggests that there likely exists a function of similar form for higher values of $n$ whose $k$th moment can be made to differ from $M_k$ only in higher powers of terms which contain $N^{-1}$ as a factor. An investigation along such lines leads to the function

(2) g(z) = Cz”'e-’'^ 

N-n 

, ^ a * n ^ N — n — 2 . , ,(n — l)(n — 2) 

.where C = ■ , m = , o = and g = 1 - ^ 




It will be shown that the $k$th moment $M'_k$ of $g(z)$ differs from $M_k$ only in terms of magnitude less than the second and higher powers of $k^2n/N$ or $kn^2/N$. Multiplying $g(z)$ by $z^k$ and integrating over the entire range of $z$ will yield $M'_k$, which turns out to be




o%”*r( 


Upon reducing the upper gamma function and performing successive steps of 
simple algebra 


Ml = n 


- 0 (« 


N — n + 2k 


kci-nk(^ , 2k - n - 2/nVi . 2k - n - 4/n' 
2 yi + ^ ^^^1 + 






A 2k — n — 2kn/n\ 
N /• 


² S. S. Wilks, "Certain Generalizations in the Analysis of Variance," Biometrika, Vol. XXIV, 1932, p. 477.

The terms in parentheses may be treated as the factored form of a polynomial

of the nfcth degree in unity. Thus the quantities — , etc., may be 

treated as the zeros with signs- changed of the corresponding polynomial in 
X (say). As a result, the successive terms after the first in the non-factored 
form of this polynomial in unity are the sums of the products of these quantities 
taken one at a time, two at a time, etc. Upon performing this multiplication 
and letting $\phi = N^n/2^nA$, $M'_k$ assumes the form

where the neglected terms are in magnitude less than the second and higher
powers of $k^2n/N$ or $kn^2/N$. If $M_k$ is handled in exactly the same manner, it
will be found that

- 1 ) . . . . . . 

^ ^ + 2k — n 

- + ?^) . . . (i - . 

, nkin - 2* + 3) , 1 

= + ...J 



where the neglected terms arc of the same order of magnitude as those neglected 
in the approximation to Ml . Before a comparison of Mk and Ml is possible, 
the factor q~^ of Ml must be expanded and multiplied into the quantity in 
brackets. This operation yields the result 


$$M'_k = \phi^k\left[1 - \frac{nk(n - 2k + 3)}{2N} + \cdots\right]$$

Thus $M_k$ and $M'_k$ agree to within neglected terms. As a matter of fact, if the values of the neglected terms are considered more carefully, it will be found that the actual difference between $M_k$ and $M'_k$ is considerably less than the given upper bound for the magnitude of neglected terms would indicate. For example, when $n = 5$ the first term in the difference is $6k(k - .9)N^{-2}$, while $625k^2N^{-2}$ or $25k^4N^{-2}$ is the upper bound for this term when only general results are used. The general formula for the first term in this difference has been obtained, but since the remaining terms have not been investigated and since the type of problems to which the distribution $g(z)$ is to be applied does not justify this refinement, it will not be considered here. Consequently, if one considers this distribution function as sufficiently determined by its low order moments and if one applies $g(z)$ only to problems in which $N$ is fairly large compared with $n^2$, then the function $g(z)$ will give a good approximation to the exact sampling distribution of $z$. Obviously, $g(z)$ is identical with the exact distribution for the known cases of $n = 1$ and $n = 2$. It is not possible under the above expansions to vary the constants in the form of $g(z)$ in such a manner as to obtain an approximation whose $k$th moment will agree with $M_k$ to within still higher powers of comparable terms.

In order to test whether or not a sample value $z = Z$ can be reasonably assumed to have been obtained in random sampling from a population of type (1) with fixed $A$, it is necessary to calculate the probability $P$ of obtaining in repeated samples a value of $z$ greater than $Z$. Thus it is necessary to evaluate

$$1 - \int_0^Z g(z)\,dz.$$

Upon making the substitution x = n^az, and letting p = ” — 1 and 

u = ^ [2n(JV - n)]"*, this 

integral can be reduced to the standard form of the incomplete gamma function. Hence $P$ assumes the form

$$P = 1 - I(u, p) \tag{3}$$

where

$$I(u, p) = \frac{1}{\Gamma(p + 1)}\int_0^{u\sqrt{p+1}} x^p e^{-x}\,dx.$$

In many applications of this distribution it will be found that the values of
$u$ and $p$ lie beyond the tabled³ values of these constants. Consequently, it
will often be sufficient to use the normal distribution to which the gamma 
distribution tends as N becomes large. This normal distribution will be 
considered next. 

Rather than obtain a normal approximation to g(z) or the gamma function 
to which g{z) reduces after the above transformation, it is more illuminating 
to find the basic descriptive parameters of the exact distribution of z and from 
them obtain a normal approximation. Such a procedure will show how rapidly 
the distribution of z approaches normality with increasing N. By using the 
recurrence formula connecting Mk+i and Mk , which can be found directly from 
the ratio of these two moments, and expressing the necessary moments in 

³ K. Pearson, Tables of the Incomplete Gamma Function, Biometric Laboratory (1922), Univ. of London.

terms of $M_1$, it can be shown that these basic descriptive parameters are expressible in expanded form as follows:

$$M_1 = \phi\left[1 - \frac{n(n+1)}{2N} + \frac{n(n+1)(n-1)(3n+2)}{24N^2} - \cdots\right]$$

$$\sigma^2 = \phi^2\left[\frac{2n}{N} - \frac{n(2n^2 - n + 1)}{N^2} + \cdots\right]$$


2(3n - 1)*F, „ (n + l)(5n - 3) 
nN L 2(3n - l)N 

,F, . 4(3n - l)(4n - 1) , 

311 + — + ... 


These values suggest that

$$w = \frac{z - \phi}{\phi\sqrt{2n/N}} \tag{4}$$

will likely be distributed approximately normally with zero mean and unit variance. As a matter of fact, by using the second limit theorem of probability,⁴ it can be shown that the distribution of $w$ approaches normality as $N$ increases indefinitely. Hence, for samples in which $N$ is large compared with $n^2$, it will be sufficient to compare the value of $w$ arising from a sample $z = Z$ with its variance of unity if a test of significance is desired. A better general approximation could have been obtained by centering the curve at $\phi\left[1 - \frac{n(n+1)}{2N}\right]$ rather than at $\phi$; however, since there is positive skewness and the true mean lies between these two values, there might arise some exaggeration in a significance test in doing so because the accuracy of such a test depends upon the accuracy of the approximation in the right hand tail of the curve.
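
To see how quickly the normal approximation becomes usable, the exact low order moments of $z$ can be compared with the leading terms $\phi$ and $2n\phi^2/N$. The sketch below assumes that the $k$th moment has the product-of-gamma-functions form attributed to Wilks, written in terms of $\phi$; that form, the function names, and the choice $n = 3$, $N = 200$ are assumptions made for illustration rather than statements taken from the paper.

```python
from math import exp, lgamma, sqrt


def exact_moment(k, n, N, phi=1.0):
    """k-th moment of the generalized variance z (assumed gamma-product form)."""
    log_ratio = sum(
        lgamma((N - i) / 2 + k) - lgamma((N - i) / 2) for i in range(1, n + 1)
    )
    return phi ** k * (2.0 / N) ** (n * k) * exp(log_ratio)


def normal_deviate(z, n, N, phi):
    """Standardized value w = (z - phi) / (phi * sqrt(2n/N))."""
    return (z - phi) / (phi * sqrt(2.0 * n / N))


n, N, phi = 3, 200, 1.0
mean = exact_moment(1, n, N, phi)
variance = exact_moment(2, n, N, phi) - mean ** 2
leading_variance = 2.0 * n * phi ** 2 / N
w = normal_deviate(1.0 + sqrt(leading_variance), n, N, phi)
```

For these illustrative values the exact mean falls slightly below $\phi$, in line with the remark above that the true mean lies between $\phi\left[1 - \frac{n(n+1)}{2N}\right]$ and $\phi$.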

Inspection of (3) and (4) shows that the only population parameter upon which these approximation distributions depend is $\phi$. There are no assumptions necessary about the population means, or variances, or covariances, except in so far as they may be related when the value of $\phi$ is postulated. This means that either (3) or (4) enables one to test whether or not it is reasonable to assume that the sample variance $z = Z$ arose in random sampling from some normal population with $\phi$ equal to the postulated value.



3. Population Assumptions 

Consider the set of variables $u_1, u_2, \ldots, u_n$ distributed according to the
normal law

(5) $$P(u_1, u_2, \ldots, u_n) = K_1 e^{-\sum_{i,j=1}^{n} b_{ij} u_i u_j}$$

* See, for example, Frechet and Shohat, A Proof of the Generalized Second Limit 
Theorem in the Theory of Probability, Transactions of the American Mathematical So- 
ciety, Vol. 33, (1931), p. 533.

and a set of variables $v_1, v_2, \ldots, v_n$ distributed according to the normal law

(6) $$P(v_1, v_2, \ldots, v_n) = K_2 e^{-\sum_{i=1}^{n} c_i v_i^2}$$

where the v's are uncorrelated with the u's and with each other. The joint
distribution of the u's and v's is expressed by

(7) $$P(u_1, \ldots, u_n, v_1, \ldots, v_n) = K_3 e^{-\sum_{i,j=1}^{n} b_{ij} u_i u_j - \sum_{i=1}^{n} c_i v_i^2}$$


Upon writing down the determinant of the coefficients of these 2n variables,
it will become evident that any one of its principal minors of any order can be
expressed as the product of a principal minor of $|b_{ij}|$ with a principal minor of
$|c_i|$. Since the distributions (5) and (6) are normal, the determinants $|b_{ij}|$
and $|c_i|$ are positive definite; consequently the determinant of the coefficients
in (7) must also be positive definite.

Now consider the orthogonal transformation

$$y_i = \frac{u_i + v_i}{\sqrt{2}}, \qquad i = 1, 2, \ldots, n$$

$$y_i = \frac{u_{i-n} - v_{i-n}}{\sqrt{2}}, \qquad i = n + 1, \ldots, 2n.$$

Since the determinant of the coefficients in (7) is invariant under an orthogonal
transformation, the resulting distribution of the y's may be expressed by

(8) $$P(y_1, y_2, \ldots, y_{2n}) = K_4 e^{-\sum_{i,j=1}^{2n} d_{ij} y_i y_j}$$

where $|d_{ij}|$ is positive definite.

In order to obtain the distribution of the variables $y_1, y_2, \ldots, y_n$, it is
necessary to integrate (8) with respect to the variables $y_{n+1}, \ldots, y_{2n}$ over
their range of values. If this integration is performed after the quadratic form
in the exponent of (8) has been expressed as a sum of squares* with coefficients
which are the ratios of principal minors of $|d_{ij}|$, it will be clear that the
integration leaves a quadratic form in the exponent which is also positive definite.
Hence after the transformation $x_i = \sqrt{2}\, y_i$ $(i = 1, 2, \ldots, n)$ the distribution
function of the variables $x_i = u_i + v_i$ $(i = 1, 2, \ldots, n)$ must be normal and
may be expressed by (1). Thus it has been shown that if the true parts $u_i$
of the variables $x_i$ are normally distributed without error and if the error parts
$v_i$ are normally distributed but are uncorrelated with the $u_i$ and with each
other, then the variables $x_i$ possess a normal distribution. The advantage of


* See, for example, Risser and Traynard, Les Principes de la Statistique Mathématique,
1933, p. 223.

this formulation will become evident when the parameter $\phi$ is expressed in
terms of the parameters of (5) and (6).

Since the v's are uncorrelated with the u's and with each other, the variance
$\sigma_i^2$ of $x_i$ is the sum of the variances of $u_i$ and $v_i$, while the correlation
$\rho_{ij}$ between $x_i$ and $x_j$ may be expressed in terms of the correlation $\rho'_{ij}$
between $u_i$ and $u_j$ and the variances $\mu_i^2, \mu_j^2, \nu_i^2, \nu_j^2$ of $u_i, u_j, v_i, v_j$
respectively. These relationships are

(9) $$\sigma_i^2 = \mu_i^2 + \nu_i^2 \quad \text{and} \quad \rho_{ij} = \frac{\rho'_{ij}}{\sqrt{(1 + \nu_i^2/\mu_i^2)(1 + \nu_j^2/\mu_j^2)}} \qquad (i \neq j).$$


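For simplicity of notation let $\lambda_i = \nu_i^2/\mu_i^2$. Now it is well known* that $\phi$ can
be expressed in the form

$$\phi = \sigma_1^2 \sigma_2^2 \cdots \sigma_n^2 \, |\rho_{ij}|.$$

If the values from (9) are inserted in $|\rho_{ij}|$ and if the resulting denominators
of elements are factored out, $\phi$ will assume the form

$$\phi = \frac{\sigma_1^2 \sigma_2^2 \cdots \sigma_n^2}{(1 + \lambda_1) \cdots (1 + \lambda_n)} \, B$$

where

$$B = \begin{vmatrix} 1 + \lambda_1 & \rho'_{12} & \cdots & \rho'_{1n} \\ \rho'_{12} & 1 + \lambda_2 & \cdots & \rho'_{2n} \\ \vdots & \vdots & & \vdots \\ \rho'_{1n} & \rho'_{2n} & \cdots & 1 + \lambda_n \end{vmatrix}.$$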

Following the methods of confluence analysis,† B can be expressed as follows:

$$B = R + \sum_{\alpha=1}^{n} \lambda_\alpha R_{)\alpha(} + \sum_{\alpha<\beta} \lambda_\alpha \lambda_\beta R_{)\alpha\beta(} + \cdots + \lambda_1 \lambda_2 \cdots \lambda_n$$

where $R = |\rho'_{ij}|$, $R_{)\alpha(}$ is the principal minor of R obtained by deleting row
and column $\alpha$, etc. R is the true correlation determinant whose rank it is the
object of this paper to test. If R is assumed to be of rank $n - t$, then all
principal minors containing more than $n - t$ rows vanish and B reduces to

$$B = \sum_{\alpha_1 < \cdots < \alpha_t} \lambda_{\alpha_1} \lambda_{\alpha_2} \cdots \lambda_{\alpha_t} R_{)\alpha_1 \alpha_2 \cdots \alpha_t(} + \cdots + \lambda_1 \lambda_2 \cdots \lambda_n.$$
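
The expansion of B in principal minors can be checked numerically. The following is a minimal sketch, assuming B is the determinant of the true correlation matrix with $\lambda_i$ added to each diagonal element; the function names (`det`, `principal_minor`, `b_direct`, `b_expanded`) are illustrative and not part of the paper.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 0:
        return 1.0  # empty minor: contributes the pure product of lambdas
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def principal_minor(m, deleted):
    """Principal minor obtained by deleting the rows and columns in `deleted`."""
    keep = [i for i in range(len(m)) if i not in deleted]
    return det([[m[i][j] for j in keep] for i in keep])


def b_direct(rho, lam):
    """B as a determinant: rho with lam[i] added to the i-th diagonal element."""
    n = len(rho)
    return det([[rho[i][j] + (lam[i] if i == j else 0.0) for j in range(n)]
                for i in range(n)])


def b_expanded(rho, lam):
    """B as the sum over deleted index sets of (product of lambdas) * (minor)."""
    n = len(rho)
    total = 0.0
    for size in range(n + 1):
        for deleted in combinations(range(n), size):
            weight = 1.0
            for a in deleted:
                weight *= lam[a]
            total += weight * principal_minor(rho, set(deleted))
    return total


# Hypothetical illustration: a rank-one "true" correlation matrix (n = 3, t = 2).
ones = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
lam = [0.1, 0.2, 0.3]
```

With a rank-one matrix every principal minor of more than one row vanishes, so only the terms with at least $t = 2$ factors of $\lambda$ survive, which is the reduced form stated above.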

The tests (3) and (4) were designed to test hypothetical values of $\phi$ by means
of the sample Z. Evidently the value of $\phi$ can be postulated by assigning
hypothetical values to the $\lambda$'s, the $\sigma$'s, and the principal minors of R.
Assigning values to the $\lambda$'s does not curtail the degrees of freedom in these
* S. S. Wilks, loc. cit., p. 477.

† Ragnar Frisch, Statistical Confluence Analysis by Means of Complete Regression
Systems, Oslo, 1934.

tests because they were derived on the basis of (1) which depends only on
$\sigma$'s and $\rho$'s. The $\lambda$'s do restrict the range of the $\rho$'s, but not their degrees
of freedom.

An inspection of the expression for $\phi$ shows that $\phi$ can be made to assume
any desired value regardless of the rank of R by merely assigning the $\sigma$'s
properly. It is therefore necessary to make some assumption regarding the
$\sigma$'s if the test is to serve the purpose for which it is intended. Here it will be
sufficient to assume that the product of the population variances may be
replaced by the product of the sample variances. This assumption will ordinarily
be approximately fulfilled for samples of the size for which it is legitimate to
employ (3) or (4); consequently this assumption does not restrict the range of
application of the test.

To postulate values of the principal minors of R beyond postulating the rank
of R would introduce hypotheses and restrictions which are irrelevant to the
fundamental purpose of the test. This difficulty will be avoided by replacing
all non-vanishing minors of R by their upper bounds of unity. Since this
will overestimate the value of B, and hence of $\phi$, the usual significance level of
.05 may be considered as decisive. Let the value of B when unity is inserted
for all non-vanishing principal minors be denoted by D. Then

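(10) $$D = \sum_{\alpha_1 < \cdots < \alpha_t} \lambda_{\alpha_1} \lambda_{\alpha_2} \cdots \lambda_{\alpha_t} + \cdots + \lambda_1 \lambda_2 \cdots \lambda_n.$$

Since

$$\prod_{i=1}^{n} (1 + \lambda_i) = 1 + \sum_{\alpha=1}^{n} \lambda_\alpha + \sum_{\alpha_1 < \alpha_2} \lambda_{\alpha_1} \lambda_{\alpha_2} + \cdots + \lambda_1 \lambda_2 \cdots \lambda_n$$

it will often be convenient to write D in the form

(11) $$D = \prod_{i=1}^{n} (1 + \lambda_i) - \Bigl\{ 1 + \sum_{\alpha=1}^{n} \lambda_\alpha + \cdots + \sum_{\alpha_1 < \cdots < \alpha_{t-1}} \lambda_{\alpha_1} \lambda_{\alpha_2} \cdots \lambda_{\alpha_{t-1}} \Bigr\}.$$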
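
The agreement between (10) and (11) can be illustrated with a short sketch. It assumes only that D is the sum of the elementary symmetric functions of the $\lambda$'s of degree t and higher; the helper names are hypothetical.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(lams, size):
    """Sum of all products of `size` distinct lambdas (equals 1 when size is 0)."""
    return sum(prod(c) for c in combinations(lams, size))


def d_from_10(lams, t):
    """D as in (10): symmetric sums of degree t, t + 1, ..., n."""
    n = len(lams)
    return sum(elementary_symmetric(lams, s) for s in range(t, n + 1))


def d_from_11(lams, t):
    """D as in (11): full product minus the symmetric sums of degree below t."""
    full = prod(1 + x for x in lams)
    return full - sum(elementary_symmetric(lams, s) for s in range(t))


# The four lambdas of application (a) in Section 4, with t = 2.
lams = [0.087, 0.119, 0.101, 0.773]
```

Form (11) needs only t subtracted terms, so it is the cheaper of the two when the postulated rank $n - t$ is high; form (10) is cheaper when t is close to n.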

As a consequence of all the above assumptions,

(12) $$\frac{Z}{\phi} = \frac{|c_{ij}|}{\phi} = \frac{(1 + \lambda_1) \cdots (1 + \lambda_n) \, |r_{ij}|}{B} \ge \frac{(1 + \lambda_1) \cdots (1 + \lambda_n) \, |r_{ij}|}{D}$$

where $|r_{ij}|$ is the sample correlation determinant.

All the essential material for testing the rank of the true correlation matrix 
is contained in (3), (4), (11), and (12). In summary, the hypothesis to be tested 
and the procedure to follow in performing the test are as follows. 

The population of n variables from which the sample is supposed drawn is
assumed to be such that (a) the true parts of the variables are normally
distributed, (b) the error parts are normally distributed but are uncorrelated
with the true parts and with each other, (c) the product of the variances may
be replaced by the product of the sample variances, (d) the values of the $\lambda$'s
are those postulated, as judged by the accuracy in measurement of the variables, and
(e) the rank of the true correlation matrix is $n - t$.

Given the value $|r_{ij}|$ of the sample correlation determinant, a lower bound
for the value of $Z/\phi$ is calculated from (11) and (12). This lower bound is
inserted in either (3) or (4), depending on the size of the sample. If (3) is
used and if $P \le .05$, or if (4) is used and $w \ge 2$, one may conclude, as judged
by the sample variance, that it is very unlikely that the sample was drawn in
random sampling from the population specified above. If one has reason to
believe that the variables are sensibly normal as indicated above and that the
postulated values of the $\lambda$'s are quite accurate, then the test shows quite
definitely that the postulated rank of the true correlation matrix is unsubstantiated
by the sample, and therefore a higher rank should be tested until a non-significant
value is obtained. Because a lower bound rather than the value of $Z/\phi$
is used, the test can be used on minimum ranks only, and hence a value of
$Z < \phi$ will not yield a test of significance. However, the test does handle the
problem for which it was designed and which is of fundamental interest, and
that is to see whether or not one is justified in assuming that a sample
represents only a certain minimum number of components.

4. Applications 

(a) Hotelling* has used an example taken from other sources to illustrate
his test on components. In order to compare results, this same example will
be treated here under the assumptions outlined above. In this example the
reliability coefficients are given. From the definition of a reliability coefficient
$r_i$, it follows at once that $r_i = \dfrac{1}{1 + \lambda_i}$. The population values of the $\lambda$'s will
be set equal to the values obtained from these sample reliability coefficients.
The data for this problem are

$$|r_{ij}| = .235, \quad N = 140, \quad n = 4, \quad \lambda_1 = .087, \quad \lambda_2 = .119, \quad \lambda_3 = .101, \quad \lambda_4 = .773.$$

Assume that the true correlation matrix in the population is of rank two, that
is, that two components are sufficient to describe the results on these tests.
Since N is large compared with $n^2$, it will be sufficient to use (4). The values
of (11), (12), and (4) are found to be

$$D = \prod (1 + \lambda_i) - \Bigl( 1 + \sum \lambda_\alpha \Bigr) = .294$$

$$\frac{Z}{\phi} \ge \frac{\prod (1 + \lambda_i) \, |r_{ij}|}{D} = 1.90$$

$$w \ge \sqrt{\frac{N}{2n}} \, [1.90 - 1] = 3.76$$


* Loc. cit., p. 16.

Since the standard deviation of w is unity, this value demonstrates clearly
that the hypothesis of only two components is untenable as judged by the
sample correlation determinant. If one assumes three components, the test
will be found to yield a non-significant value. Hence it may be concluded that
under the hypotheses on which the test is based, the sample does not justify
the assumption of less than three components. Hotelling's test indicated the
necessity for two components but was uncertain about the third, the decision
resting upon a variate value of 1.31 as against a standard deviation of unity.
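
The arithmetic of application (a) can be retraced in a few lines. This is only a sketch: it assumes that (4) has the form $w = \sqrt{N/2n}\,(Z/\phi - 1)$, which is the form that reproduces the printed value 3.76, and the variable names are illustrative.

```python
from math import prod, sqrt

# Data quoted for application (a).
lams = [0.087, 0.119, 0.101, 0.773]
det_r = 0.235          # sample correlation determinant |r_ij|
N, n = 140, 4

full = prod(1 + x for x in lams)
D = full - (1 + sum(lams))             # (11) with t = 2, i.e. rank two
ratio = full * det_r / D               # lower bound for Z/phi from (12)
w = sqrt(N / (2 * n)) * (ratio - 1)    # assumed form of (4)

# The printed figure follows when the ratio is first rounded to two decimals.
w_printed = sqrt(N / (2 * n)) * (round(ratio, 2) - 1)
```

Carrying full precision gives a value of w slightly below the printed one (about 3.75 rather than 3.76); either way it is well above 2, so the conclusion about two components is unaffected.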

(b) Thurstone, in his “Vectors of Mind,” considers an example taken from a 
series of fifteen psychological tests. After applying his centroid method to the 
data, he inspects his results and concludes that four components are sufficient 
to account for everything except random errors. It is impossible to test his 
conclusions explicitly as above because the size of the sample is not given and 
the reliability coefficients are not known. Nevertheless, if it is legitimate to 
assume that the sample is sufficiently large to justify the use of this test, in- 
teresting conclusions can be obtained on the assumption that only four com- 
ponents are needed. 

Suppose that $\lambda_i = \tfrac{1}{2}$, which implies that the variance of error is half as large
as the true sampling variance for each variable. Here (10) is more convenient
than (11) for computing the value of D. The values of (10) and (12) are
found to be

D = „C,(i)“ -H «C*(J)“ -H «Ci(i)“ -h (§)“ = .126 

? > lid. 

0 - .0003' 

Evidently, the value of $|r_{ij}|$ must lie in the neighborhood of .0003 if the test
is not to yield a significant result which contradicts the hypothesis. However,
the correlations in $|r_{ij}|$ are given to only three decimal places, and therefore
a legitimate value in the neighborhood of .0003 can not be realized. It is to be
noted that the postulated values of the $\lambda$'s are equivalent to postulating that
all reliability coefficients are equal to $\tfrac{2}{3}$, a value which should be considered as
unusually low. It would seem reasonable to avoid using material in which the
variance of error is larger than one-half the variance of random sampling, unless
the variance of random sampling is exceedingly small.



CONTRIBUTIONS TO THE THEORY OF COMPARATIVE STATISTICAL
ANALYSIS, I. FUNDAMENTAL THEOREMS OF
COMPARATIVE ANALYSIS¹

By William G. Madow

This is the first of several papers in which there will be presented a general
approach to the statistical examination of hypotheses which are false if any of
several things are true. Phenomena requiring such a statistical theory are
investigated quite frequently. As examples may be cited the studies of lag
correlation in time series, periodogram analysis in geophysics, factor analysis
in psychology, and analysis into components in agriculture.²

The theorems of this paper have one purpose: to permit the reduction of the 
distributions by which the hypotheses are to be tested to essentially the joint 
distribution of the statistics which contain the information offered by the data 
concerning the truth or falsity of the things which will negate the hypotheses. 
In order to do this it has been necessary to generalize the theorem of Poincaré
on the probability that at least one of several events occur.³ As illustrations
there are stated, after Theorems III, VI, and IX, generalizations of a
distribution derived by Jordan, (5) page 109.⁴

In a second paper, we shall give a complete derivation of the joint distribu- 
tions necessary for the applications of the analysis of variance. A reconsidera- 
tion of the Schuster periodogram will be included. In other papers these 
results will be extended to problems arising in the theory of regression, and to 
problems of the distributions of medians, etc. 

The fundamental theorems of comparative analysis are now obtained in such 
a form that they are applicable to problems in the theory of probability no 
matter what the distributions may be. Some special cases of these theorems⁵


¹ Presented to the American Mathematical Society, March 27, 1937. Research under a
grant-in-aid from the Carnegie Corporation of New York.

² Naturally these techniques are also useful in other branches of science than those in
which they were first applied. It should be noted that by analysis into components we
here refer to the work of Fisher, (2), chapter 6.

³ See Poincaré, (7), page 60. This theorem is attributed to Poincaré by Jordan, (6),
and Fréchet, (3).

⁴ This distribution states the probability that in r trials of an experiment which has
exactly n possible results, these results being mutually exclusive, each of the possible
results occurs at least once. Jordan's derivation has been simplified by Fréchet, (3),
page 12.

⁵ The theorems are, of course, part of the theory of measure and integration.

have been used in connection with the derivation of distributions of positional
statistics such as the in order of N elements,⁶ and others.

Let $\Omega$ be a collection of elements x, and let $\mathcal{A}$ be a set of subsets of $\Omega$. Then,
the axioms which the elements of $\mathcal{A}$ are to satisfy are⁷

I. $\mathcal{A}$ is a field;⁸

II. $\Omega \in \mathcal{A}$;

III. To every $A \in \mathcal{A}$ there is assigned a non-negative real number $P(A)$;

IV. $P(\Omega) = 1$;

V. If $A \in \mathcal{A}$ and $B \in \mathcal{A}$, and $AB = 0$, then $P(A + B) = P(A) + P(B)$.

We shall regard $\Omega$ as the set of possible results of an experiment $\mathcal{E}$. By events
we shall mean elements of $\mathcal{A}$. The complement $\bar{A}$ of A with respect to $\Omega$ will
be an element of $\mathcal{A}$ if A is an element of $\mathcal{A}$. $\bar{A}$ consists of all elements of $\Omega$
which are not elements of A and hence is the event which occurs if and only
if A does not occur.⁹

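Let the subsets of $\Omega$

(1) $$E_1, E_2, \ldots, E_k$$

be elements of $\mathcal{A}$. Then, if $\alpha_1, \alpha_2, \ldots, \alpha_k$ is a permutation of $1, 2, \ldots, k$,
the set

(2) $$E_{\alpha_1} E_{\alpha_2} \cdots E_{\alpha_j} \bar{E}_{\alpha_{j+1}} \cdots \bar{E}_{\alpha_k}$$

is an element of $\mathcal{A}$ and is the event which occurs whenever all the events
$E_{\alpha_1}, E_{\alpha_2}, \ldots, E_{\alpha_j}$ occur, while none of the events $E_{\alpha_{j+1}}, E_{\alpha_{j+2}}, \ldots, E_{\alpha_k}$
occur.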

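The events (1) are said to be independent if and only if

(3) $$P(E_{\alpha_1} \cdots E_{\alpha_j} \bar{E}_{\alpha_{j+1}} \cdots \bar{E}_{\alpha_k}) = \prod_{\nu=1}^{j} P(E_{\alpha_\nu}) \cdot \prod_{\nu=j+1}^{k} P(\bar{E}_{\alpha_\nu})$$

for all selections of the sets (1) and their complements.¹⁰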

Theorem I. The probability that the first j of the k events (1) occur, while the
remaining $k - j$ events do not occur, is


⁶ See, for example, Gumbel, (4). It is noted that Theorems I, II, and III are stated by
Arne Fisher, (1), page 42, who assumes, however, that the events are independent.

⁷ These axioms are stated by Kolmogoroff, (6), page 2.

⁸ A set of sets is a field if the fact that A and B are elements of the set implies that
$A + B$, $AB$, and $A - AB$ are also elements of the set.

⁹ The event A will be said to have occurred if the result of the performance of the
experiment $\mathcal{E}$ is an element of A.

¹⁰ See Kolmogoroff, (6), page 9 for a discussion of various equivalent definitions of
independence.

(4) $$P(E_1 \cdots E_j \bar{E}_{j+1} \cdots \bar{E}_k) = \sum_{\nu=0}^{k-j} (-1)^\nu \sum_{\substack{\alpha_1, \ldots, \alpha_\nu = j+1 \\ \alpha_1 < \alpha_2 < \cdots < \alpha_\nu}}^{k} P(E_1 \cdots E_j E_{\alpha_1} \cdots E_{\alpha_\nu}).$$

Proof. Let $k = j + 1$. Then it follows from Axiom V that

(5) $$P(E_1 E_2 \cdots E_j) = P(E_1 E_2 \cdots E_j E_{j+1}) + P(E_1 E_2 \cdots E_j \bar{E}_{j+1}).$$

Hence the theorem is true for $k = j + 1$ and any $j > 0$. Let the theorem be
true for $j + 1, \ldots, k - 1$ events. From Axiom V it follows that

(6) $$P(E_1 \cdots E_j \bar{E}_{j+1} \cdots \bar{E}_k) = P(E_1 \cdots E_j \bar{E}_{j+1} \cdots \bar{E}_{k-1}) - P(E_1 \cdots E_j \bar{E}_{j+1} \cdots \bar{E}_{k-1} E_k).$$

Substituting from (4) the theorem is proved.
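
Theorem I can be checked by enumeration on a small finite space. The sketch below is illustrative only: it represents $\Omega$ as the keys of a dictionary of exact probabilities and events as Python sets, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, weights):
    """P(event) as the sum of the point probabilities of its elements."""
    return sum(weights[x] for x in event)


def intersect(sets, omega):
    """Intersection of the given events (the whole space if none are given)."""
    out = set(omega)
    for s in sets:
        out &= s
    return out


def theorem_one(events, j, weights):
    """Right-hand side of (4): alternating sum over selections from E_{j+1..k}."""
    omega = set(weights)
    first, rest = events[:j], events[j:]
    total = Fraction(0)
    for nu in range(len(rest) + 1):
        for chosen in combinations(rest, nu):
            total += (-1) ** nu * prob(intersect(first + list(chosen), omega), weights)
    return total


def direct(events, j, weights):
    """Left-hand side of (4): first j events occur, the remaining ones do not."""
    occurring = intersect(events[:j], set(weights))
    for e in events[j:]:
        occurring -= e
    return prob(occurring, weights)


# Hypothetical example: twelve equally likely points and three events.
weights = {x: Fraction(1, 12) for x in range(12)}
events = [
    {x for x in range(12) if x % 2 == 0},   # E_1: even numbers
    {x for x in range(12) if x % 3 == 0},   # E_2: multiples of three
    {0, 1, 2, 3},                           # E_3
]
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a floating-point tolerance.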

Let $n \ge n_1 + \cdots + n_t$, $n_i \ge 0$ $(i = 1, \ldots, t)$; let

$$(n; n_1, n_2, \ldots, n_t) = \frac{n!}{n_1! \, n_2! \cdots n_t! \, (n - n_1 - \cdots - n_t)!}.$$
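
For readers who wish to experiment, the bracket notation can be written as a small function. This is a sketch; the name `coeff` is hypothetical, and with a single lower index the symbol reduces to the ordinary binomial coefficient.

```python
from math import factorial


def coeff(n, *parts):
    """(n; n_1, ..., n_t) = n! / (n_1! ... n_t! (n - n_1 - ... - n_t)!)."""
    rest = n - sum(parts)
    if rest < 0 or any(p < 0 for p in parts):
        raise ValueError("parts must be non-negative and sum to at most n")
    denom = factorial(rest)
    for p in parts:
        denom *= factorial(p)
    return factorial(n) // denom
```

For instance, `coeff(5, 2)` is the number of ways of selecting 2 of 5 objects, and `coeff(6, 2, 3)` counts the ways of splitting 6 objects into groups of 2, 3, and the remaining 1.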


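Corollary. If, for each value of $\nu$, $(\nu = 1, 2, \ldots, k - j)$, the $(k - j; \nu)$
terms

$$P(E_1 \cdots E_j E_{\alpha_1} \cdots E_{\alpha_\nu})$$

which can be obtained by selecting $\alpha_1, \alpha_2, \ldots, \alpha_\nu$ without repetition from
$j + 1, j + 2, \ldots, k$, are all equal, then

(7) $$P(E_1 \cdots E_j \bar{E}_{j+1} \cdots \bar{E}_k) = \sum_{\nu=0}^{k-j} (-1)^\nu (k - j; \nu) P(E_1 \cdots E_{j+\nu}).$$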


Let

(8) $$S(\nu) = \sum P(E_{\alpha_1} E_{\alpha_2} \cdots E_{\alpha_\nu})$$

where the summation extends over the $(k; \nu)$ terms

(9) $$P(E_{\alpha_1} E_{\alpha_2} \cdots E_{\alpha_\nu})$$

which can be obtained by selecting $\nu$ of the k events (1) without repetition.
If all the terms (9) which can be obtained by selecting $\nu$ of the k events (1)
without repetition are equal, then

(10) $$S(\nu) = (k; \nu) P(E_1 \cdots E_\nu).$$


¹¹ By definition

$$\sum_{\nu=0}^{k-j} (-1)^\nu \sum_{\substack{\alpha_1, \ldots, \alpha_\nu = j+1 \\ \alpha_1 < \cdots < \alpha_\nu}}^{k} P(E_1 \cdots E_j E_{\alpha_1} \cdots E_{\alpha_\nu}) = P(E_1 \cdots E_j) + \sum_{\nu=1}^{k-j} (-1)^\nu \sum_{\substack{\alpha_1, \ldots, \alpha_\nu = j+1 \\ \alpha_1 < \cdots < \alpha_\nu}}^{k} P(E_1 \cdots E_j E_{\alpha_1} \cdots E_{\alpha_\nu}).$$

Theorem II. The probability that exactly j of the k events (1) occur is

(11) $$P_{(j)} = \sum_{\nu=0}^{k-j} (-1)^\nu (j + \nu; \nu) S(j + \nu).$$

Proof. If $A_{(j)}$ is the subset of $\Omega$ defined by the requirement that exactly j
of the events (1) occur, then $A_{(j)}$ is the sum of $(k; j)$ disjunct sets:

(12) $$A_{(j)} = \sum_{\substack{\alpha_1, \ldots, \alpha_j = 1 \\ \alpha_1 < \cdots < \alpha_j}}^{k} E_{\alpha_1} \cdots E_{\alpha_j} \bar{E}_{\alpha_{j+1}} \cdots \bar{E}_{\alpha_k}$$

where $\alpha_{j+1}, \ldots, \alpha_k$ have those of the values $1, \ldots, k$ which remain after the
selection of $\alpha_1, \ldots, \alpha_j$. By Axiom V we may replace A by P in (12). Upon
substituting from (4) we note that the resulting terms of (12) which depend on
the same number $\nu$, $\nu = j, \ldots, k$, of events have the same sign, that all $S(\nu)$,
$\nu = j, \ldots, k$, occur, that no term depending on fewer than j events occurs,
and that any particular $P(E_{\alpha_1} E_{\alpha_2} \cdots E_{\alpha_{j+\nu}})$ will occur in those of the terms
of (12) the j occurring events of which are a subset of $E_{\alpha_1}, E_{\alpha_2}, \ldots, E_{\alpha_{j+\nu}}$
and will occur in no other term of (12). Hence the coefficient of $S(j + \nu)$ in
(11) is $(-1)^\nu (j + \nu; \nu)$. This completes the proof of the theorem.

Corollary. If (10) is true for $\nu = j, \ldots, k$, then

(13) $$P_{(j)} = \sum_{\nu=0}^{k-j} (-1)^\nu (k; j, \nu) P(E_1 E_2 \cdots E_{j+\nu}).$$

Theorem III. The probability that at least j of the k events (1) occur is

(14) $$P^{(j)} = \sum_{\nu=0}^{k-j} (-1)^\nu (j + \nu - 1; \nu) S(j + \nu).$$

Proof. If $A^{(j)}$ is the subset of $\Omega$ defined by the requirement that at least j
of the events (1) occur, then $A^{(j)}$ is the sum of $k - j + 1$ disjunct sets:

(15) $$A^{(j)} = A_{(j)} + A_{(j+1)} + \cdots + A_{(k)}.$$

By Axiom V we may replace A by P in (15). Substituting from (11)

(16) $$P^{(j)} = \sum_{\nu=0}^{k-j} c_\nu S(j + \nu),$$

where

$$c_\nu = (j + \nu; j + \nu) - (j + \nu; 1) + \cdots + (-1)^\nu (j + \nu; \nu), \qquad (\nu = 0, \ldots, k - j).$$

It is easy to prove that

(17) $$(-1)^\nu (j + \nu - 1; \nu) = \sum_{\mu=0}^{\nu} (-1)^{\nu - \mu} (j + \nu; j + \mu).$$

Corollary. If (10) is true for $\nu = j, \ldots, k$, then

(18) $$P^{(j)} = \sum_{\nu=0}^{k-j} (-1)^\nu (j + \nu - 1; \nu)(k; j + \nu) P(E_1 E_2 \cdots E_{j+\nu}).$$

To provide examples illustrating these theorems let us consider r experiments

(19) $$\mathcal{E}^{(1)}, \mathcal{E}^{(2)}, \ldots, \mathcal{E}^{(r)}.$$

Let $\mathcal{E}^{(i)}$ have k mutually exclusive outcomes

(20) $$O_1^{(i)}, \ldots, O_k^{(i)}.$$

Then, it is easy to define the spaces $\Omega^{(i)}$, the probability function $P_i(E^{(i)})$,
the combinatory product

$$\Omega = \Omega^{(1)} \times \Omega^{(2)} \times \cdots \times \Omega^{(r)},$$

the set $\mathcal{A}$ and the probability function $P(E)$ so that Axioms I, ..., V are
satisfied and hence Theorems I, II, and III are valid.
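
Before turning to the example, Theorems II and III can be verified on a small finite space. The sketch below assumes nothing beyond (8), (11), and (14); the space, the events, and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def s_sum(events, nu, weights):
    """S(nu) of (8): sum of P(intersection) over all selections of nu events."""
    total = Fraction(0)
    for chosen in combinations(events, nu):
        common = set(weights)
        for e in chosen:
            common &= e
        total += sum(weights[x] for x in common)
    return total


def exactly(events, j, weights):
    """Theorem II, equation (11): probability that exactly j events occur."""
    k = len(events)
    return sum((-1) ** nu * comb(j + nu, nu) * s_sum(events, j + nu, weights)
               for nu in range(k - j + 1))


def at_least(events, j, weights):
    """Theorem III, equation (14): probability that at least j >= 1 events occur."""
    k = len(events)
    return sum((-1) ** nu * comb(j + nu - 1, nu) * s_sum(events, j + nu, weights)
               for nu in range(k - j + 1))


def count_distribution(events, weights):
    """Distribution of the number of occurring events, by direct enumeration."""
    dist = [Fraction(0)] * (len(events) + 1)
    for x, w in weights.items():
        dist[sum(x in e for e in events)] += w
    return dist


# Hypothetical example: twelve equally likely points and three events.
weights = {x: Fraction(1, 12) for x in range(12)}
events = [
    {x for x in range(12) if x % 2 == 0},
    {x for x in range(12) if x % 3 == 0},
    {0, 1, 2, 3},
]
dist = count_distribution(events, weights)
```

Here `dist[j]` should agree with `exactly(events, j, weights)`, and the tail sums of `dist` with `at_least`. Note that `at_least` is written for $j \ge 1$ only, since the coefficient $(j + \nu - 1; \nu)$ is not defined by `math.comb` when $j = 0$.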

We shall assume that the experiments (19) are independent. 

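Let

$$\bar{O}_j, \qquad (j = 1, \ldots, k)$$

be the event which occurs when neither $O_j^{(1)}$ nor $O_j^{(2)}$ nor $\cdots$ nor $O_j^{(r)}$ occur.
Then $O_j$ occurs if upon performance of the experiments (19) at least one of
$O_j^{(1)}, \ldots, O_j^{(r)}$ occur.

It is an immediate result of the definition of independence that

(21) $$P(\bar{O}_{\alpha_1} \bar{O}_{\alpha_2} \cdots \bar{O}_{\alpha_\nu}) = \prod_{i=1}^{r} \{ 1 - P(O_{\alpha_1}^{(i)}) - \cdots - P(O_{\alpha_\nu}^{(i)}) \}.$$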


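From Theorem I, the probability that $O_1, O_2, \ldots, O_j$ each occur while not
one of $O_{j+1}, O_{j+2}, \ldots, O_k$ occurs is

(22) $$P(O_1 \cdots O_j \bar{O}_{j+1} \cdots \bar{O}_k) = \sum_{\nu=0}^{j} (-1)^\nu \sum_{\substack{\alpha_1, \ldots, \alpha_\nu = 1 \\ \alpha_1 < \cdots < \alpha_\nu}}^{j} \prod_{i=1}^{r} \{ 1 - P(O_{j+1}^{(i)}) - \cdots - P(O_k^{(i)}) - P(O_{\alpha_1}^{(i)}) - \cdots - P(O_{\alpha_\nu}^{(i)}) \}.$$

From Theorem II, the probability that exactly j of $O_1, O_2, \ldots, O_k$ occur is

(23) $$P_{(j)} = \sum_{\nu=0}^{j} (-1)^\nu (k - j + \nu; \nu) S(k - j + \nu),$$

where

$$S(k - j + \nu) = \sum_{\substack{\alpha_1, \ldots, \alpha_{k-j+\nu} = 1 \\ \alpha_1 < \cdots < \alpha_{k-j+\nu}}}^{k} \prod_{i=1}^{r} \{ 1 - P(O_{\alpha_1}^{(i)}) - \cdots - P(O_{\alpha_{k-j+\nu}}^{(i)}) \}.$$

Since the probability that at least j of $O_1, O_2, \ldots, O_k$ occur is equal to 1
minus the probability that at least $k - j + 1$ of $\bar{O}_1, \bar{O}_2, \ldots, \bar{O}_k$ occur,¹² it
follows at once from Theorem III that

(24) $$P\{\text{at least } j \text{ of } O_1, \ldots, O_k \text{ occur}\} = 1 - \sum_{\nu=0}^{j-1} (-1)^\nu (k - j + \nu; \nu) S(k - j + \nu + 1).$$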


¹² There are, of course, other ways of computing these probabilities.

The case treated by Fréchet and Jordan is that which occurs when we assume
$P(O_j^{(i)}) = P(O_j^{(h)})$, $(j = 1, \ldots, k)$, $(i, h = 1, \ldots, r)$ and in (24) let j = 1.

It is not difficult to obtain further generalizations of Jordan’s distribution by 
defining events which occur if and only if fewer than f of r events occur and 
then proceeding as above. 
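
The use of (21) and (24) can be illustrated with a small computation. The sketch below assumes that the sum in (24) runs over $\nu = 0, \ldots, j - 1$ and takes `probs[i][a]` to be the probability of outcome a in experiment i; the names and the brute-force check are illustrative only.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def s_bar(probs, size):
    """S(size) for the complementary events: by (21), the sum over selections of
    `size` outcomes of the probability that none of them occurs in any experiment."""
    k = len(probs[0])
    total = Fraction(0)
    for chosen in combinations(range(k), size):
        term = Fraction(1)
        for row in probs:
            term *= 1 - sum(row[a] for a in chosen)
        total += term
    return total


def at_least(probs, j):
    """Equation (24): probability that at least j of the k outcomes appear."""
    k = len(probs[0])
    return 1 - sum((-1) ** nu * comb(k - j + nu, nu) * s_bar(probs, k - j + nu + 1)
                   for nu in range(j))


def at_least_brute(probs, j):
    """Same probability by enumerating every sequence of results."""
    k = len(probs[0])
    total = Fraction(0)
    for results in product(range(k), repeat=len(probs)):
        if len(set(results)) >= j:
            weight = Fraction(1)
            for row, outcome in zip(probs, results):
                weight *= row[outcome]
            total += weight
    return total


# Hypothetical example: r = 4 identical experiments, k = 3 equally likely outcomes.
probs = [[Fraction(1, 3)] * 3 for _ in range(4)]
```

With these inputs, `at_least(probs, 3)` is the probability that every one of the three outcomes appears at least once in four trials, which is the quantity described in the footnote on Jordan's distribution. The brute-force version enumerates $k^r$ sequences and is practical only for small cases.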

Certain useful generalizations of Theorems I, II, and III will now be derived.
Let the subsets of $\Omega$

(25) $$E_1^{(s)}, E_2^{(s)}, \ldots, E_{k^{(s)}}^{(s)}, \qquad (s = 1, \ldots, p)$$

be elements of $\mathcal{A}$, and let $N = k^{(1)} + k^{(2)} + \cdots + k^{(p)}$.

Let < k^‘\ (s = 1, • • • , p); and let 

( 26 ) = « = i, ...,p), 


Let 

(27) Q'“' = I1 n E\‘^ « = 1, ...,p). 

Furthermore, let for each value of (s = A, • • • , p), the 
possible distinct selections of of the sets 

(28) ElV.y+t,Eyi]y^, ...,El’il 

be arranged in some order, and, if the intersection of the sets of the i,*'’ 
selection be denoted by 


(29) 
let 

(30) 


g’*(/*>) (s==h,... ,p), 

(f. = 1,2, ...,(fc^'> -/'>;.">)), 


There are ]fl sets (30), for each value of /i, (A = 1, • • • , p), 

and any set of fixed values of • • • , 

Let for each value of s, (s = A, • • • , p) the possible distinct selec- 

tions of of the sets 


(31) (i = 1, • • • , 

be arranged in some order, and if the intersection of the sets of the selection 
be denoted by 

(32) g’*(r‘*') 


let 

(33) g** ’V*', = fr

There are /*’) sets (33), for each value of h, (h = 1, • • • , p), and any

•—A 

set of fixed values of !»**’, • • • , v*"’. 

It is clear that the various sets that have been defined are elements of A. 
The fact that the sets are the events which occur if and only if certain sets of 
events occur is also too obvious to require further comment. 

Theorem IV. The probability that of the N events (25) the first $j^{(s)}$ of
superscript s occur and the remaining $k^{(s)} - j^{(s)}$ of superscript s do not occur, $s = 1, \ldots, p$, is


P(q(p)q(p)') ^ V 

► (U-0 > 




(34) 


(2)^ 


(Jk(l)-.y{l);p(l)) 


E (-1) 

fCpT-O 


r(l)4.r(2)4.. . 


(H;(p)-.J (p);y(p)) 

L •>(.' 


^ ... 2 pr«- 

»1«1 to-l 


)]. 


Proof. Theorem I is a proof of Theorem IV for p = 1. The theorem may 
then be proved either by regarding it as a special case of Theorem I and col- 
lecting terms, or by induction. 

•11 . A 1 A fll f21 

V 


Corollary. If, for each possible set of values of 

ft (A‘*> -/*’;.“’) 

terms 
(35) 

are all equal, then 

P(qWq(p)’) ^ Z ••• Z (-1)'"’+”-" 

yClUiO f(p)— 0 


the 


P[,“ ••‘(p«>, . . . , /">)] 


(36) 


.+fCp) 






Let, for each value of A, (A = 1, • • • , p), 
(37) 


.,(*) ..(*+1) (p)\ 

t • • • f V ) 


= E 

U-l 


(fc<p>;v(p)> 




It is apparent that by using (34) it is possible to obtain an expression for (37) 
which does not depend explicitly on In fact 

-s(A • • • , = Z ••• ..E 


pr (A~" 1 )m0 


( 38 ) 




<1-1 


(jfc(A-l).^y(A~l);,(A“l)) (*(fc);vU)) 

Z E 


E 


<p— 1 


P[g-> •••♦*-*(/“, 






)].

If the different terms of (37) are all equal, then

If the different terms of (38) are all equal, then 


S(i-‘«, 


..w 


) 


(40) 





- .cA 

fc-1 

n 



P[g*‘ 


(- 1 ) 






Theorem V. The probability that of the N events (25) the first $j^{(s)}$ of superscript
s occur and the remaining $k^{(s)} - j^{(s)}$ do not occur, $(s = 1, \ldots, h - 1)$, and exactly
$j^{(s)}$ events of superscript s occur $(s = h, \ldots, p)$, is


Poa)...,(p),(Q'*-“Q“-"') 


(41) 


t (*)-,■(») 


*(P)— ;(P) 

r(p)M0 


(- 1 ) 




ft (/•’ + /•'; «'<*^)5(/« + 


»""h 


Proof, The theorem may be proved, either by induction using Theorem II, 
or by obtaining disjunct sets as in Theorem II and using Theorem IV. 

Corollary I. If (39) is true for all sets of possible values of • • • , 
then 




(42) 


X) ••• S 

0 r(pT— 0 




•—A 


Corollary II. If (40) is true for all sets of possible values of v , v , 
then 


O) ..(2) „(r) 


*(1)^(1) Jt(p)-.y(p) 

Poa)...,(,„((3«-“<3‘*-“') = i: .. ^ 

p ( 1 )«a0 V (p)aaO 


t (-1)'“ 


)+...+p(p) 


(43) 


a—l *— A 


Theorem VI. The probability that of the N events (25) the first $j^{(s)}$ events of
superscript s occur and the remaining $k^{(s)} - j^{(s)}$ do not occur, $(s = 1, \ldots, g - 1)$, exactly
$j^{(s)}$ events of superscript s occur $(s = g, \ldots, h - 1)$, and at least $j^{(s)}$ events of
superscript s occur $(s = h, \ldots, p)$ is





t(p)— j(p> 

» X ... Z 

»(07«.O ii(p)liO 


(44) 


n if"' + f") tl (/•’ + 




Proof. The theorem may be proved either by induction using Theorem III 
or by obtaining disjunct sets as in Theorem III and using Theorem V. 
Corollary I. If (39) is true for all sets of possible values of 



/»>, . . . 



then 


fc(p)-y (p) 



p[}uJ;;;5u-’.),((3‘''"q‘''~“') = i 

(_2)v(ff>4-*--+»'(p> 






(45) 

liik^";f"\r^"')tl[if"' + f"' 







Corollary II. If (40) is true for all sets of possible values of v'*’, • 


then 

*(1)^(1) 

jt(p)^(p) 



»(l)-0 

,.(pTlo 


(46) 

fl (fc'*' - f"' ; .-'") il (fc‘*> ; /*>. f) t[ 

[(/*> + /'» -l;/")(fc'*';i'*> 

+ f")] 


P[g‘- 




Let us again consider the experiments (19), and let us assume that 
(i = 1, • • • , r) has as its mutually exclusive results 


(47) 


oi:> 


( 8 = 1 , 2 ). 


Let Ot, be the event which occurs if, upon performance of the experiments 
(19) at least one of the events Oil*, 0{!’, . . . , occur, and let Ot, be the 
event which occurs if and only if 0«, does not occur. 

We may state the probability that the event Ei , which occurs if and only if 
at least of the events On , (< = 1, . • . , A:**’) occur, and the event Et , which 
occurs if and only if at least j”’ of the events 0« , (t = 1, • • • , fc‘*’) occur, both 
occur. 

It is apparent that

(48) $$P(E_1 E_2) = 1 - P(\bar{E}_1) - P(\bar{E}_2) + P(\bar{E}_1 \bar{E}_2),$$

where $\bar{E}_s$ is the event which occurs if and only if $E_s$ does not occur, $(s = 1, 2)$.

From Theorem III


P{£,) = % + 1) 

y(O-0 


(s = 1, 2), 


where 


*(•) 

-/•’ + /*’ + 1) = £ 

«l.* • •.flfjfeC*)— y(«)4.|»(f 

«i<* •*<«*(•)— I 


IT {1 ~ — ... — , (5 — 1, 2). 


From Theorem VI 


•-1 

~ /^> + + 1, + 1), 

where 

(ifc(l);,(l)_„(l)-.l) (ife(2);,(2)^„(2Ui) 

S(fc(i) /I) + ,<^) + 1 , ^ + 1 ) « 5 : 2 

M-l 12-1 

- /» + + 1, + .<“> + 1)], 

and 


,(i)_i ,(«)_i 


p{£i£i) = E E 

,TiT-o >(^o 


P[g‘-*(ifc«> - i“> + + 1, - /« + + 1)] = 

IIu- E P(oLVi)- Z P(o^‘l)k 

i-l I, V-l M-l J 

the subscripts a, , (y = 1, • • • , A:“’ — + !<“’ + 1), being those of the * 2 “* 

selection of fc**’ — /*’ + v**’ + 1 events from events, and the subscripts 
/3. , (/I = 1, • • • , — i'*’ + I*® + 1), being those of the tV** selection of 

^(8) _ j( 2 ) _j_ |,(» j events from events. 

The desired probability is then obtained by substituting from (49) and (50) 
into (48). The procedure is perfectly general, and applies directly to situations 
in which p > 2. 

We shall now investigate the results obtained by requiring that the events 
considered satisfy a relation of implication. 

Let the subsets of $\Omega$

(51) $$E_{1s}, E_{2s}, \ldots, E_{ks}, \qquad (s = 1, \ldots, p),$$

be elements of $\mathcal{A}$, and let

(52) $$E_{is} \subset E_{it}, \qquad (i = 1, \ldots, k),$$

if $s < t$.

It follows that

(53) $$P(E_{is} E_{it}) = P(E_{is}), \qquad (i = 1, \ldots, k), \quad (s < t).$$

Let ji < ji < • • • < ji and let 

(54) Qi^ilflEu, 

Let ji < jt < • ■ • < jt and let 

(55) 


- n n 

n-l t-7,+1 


(t= 1,2, •••,p). 


(56) 


From (52) and (53), it follows that 

PiQ.Q't) = p(\U tt £..] 


is + i "1 ^ \ 

n IT n Eu), (i« = o) « = 1 , 2 , ...,p). 


L^t ii < j 2 < • • • < jp and for each value of (s = 1, • • • , p), consider a 
selection of jg + Ps events of second subscript s from (51). Let the p selections 
thus obtained be such that 

d” ^ 1 (s — - 1 , 2, • ' • , p), (jp-fi “ Aj), 


and if Eu in one of the events of the selection of events of second subscript s 
then the fact that t > s implies that Eu is one of the events of the selection of 
events of second subscript t. 

From (52) and (53), the probability of the occurrence of all the events of the 
p selections thus obtained is a function of jp + Pp events, of which are of 
second subscript s, (s = 1 , • • • , p) where 

(57) fii + 1^2 + • - + fJiu - js + Vsy (s = 1, • • • , p), 

and for a given set of values of ji , ^ 2 , • • * , jp the fis and v, determine one another 
uniquely, (« = 1, ... ,p). 

For a definite set of values of ji , • • • , jp and Mi , • • • , Mp or ji , • • • , jp and 
vi , • • • ,Pp there will be 


(i»+i ~ i*; ^») = (i«+i - js^jni - Ml - • • • ~ M«), (« = • • • , p), (jp+i = fc) 

possible distinct selections of j, + i'll , (s = 1, • • • , p) events of second sub- 
script 5, j» of which are preassigned, from j^^i events, (s = 1, • • • , p). 

Let these s('lections be arranged in some order for each value of s, s = 1, • • • , p, 
and let 

(58) 7»i12 • »p(mi , M2 , • • • , Mp) 

be the event which occurs when for all values of s, (s = 1, • • • , p), the events 
of the f/*' selection of j, + v, events of second subscript s all occur. 

It is understood that the $j_s$ preassigned events of second subscript s are among the $j_t$
preassigned events of second subscript t $(t > s)$ in the events (58).

A typical event (58) is

(59) - • • • ) Mp) = ft n (io + >'# — 0)- 

There will be, for a definite j, events of second subscript «, (s = !,•■•,?) 


(60) 


ill (y»+i 


(^p+i ““ 


events such as (58). 

For a definite set of values of , • • • , /Xp there will be, for each value of 

(« = 1, . . • , p) 

(k - M— 1 — • • ‘ — Ml ; M*), (s == 1, 2, • • • , p) 

possible distinct selections of js + events of second subscript s, 
of which are preassigned from k events, (s = 1, • • • , p). 

Let these selections be arranged in some order for each value of $s$,
$(s = 1, \ldots, p)$, and let

$$Q_{t_1 t_2 \cdots t_p}(\mu_1, \mu_2, \ldots, \mu_p) \tag{61}$$

be the event which occurs if and only if, for all values of $s$, the events of the
$t_s$-th set of $j_s + \nu_s$ events of second subscript $s$ all occur, $(s = 1, \ldots, p)$, and
the first subscripts of the events of the $t_s$-th set of events of second subscript $s$
are among the first subscripts of the events of all the selections of events of
second subscript greater than $s$, $(s = 1, \ldots, p)$.

There will be

$$(k;\ \mu_1, \mu_2, \ldots, \mu_p) \tag{62}$$

events (61) which may thus be obtained.

Theorem VII. The probability that of the $pk$ events (51) the first $j_s$ events of
second subscript $s$ occur and the remaining $k - j_s$ events do not occur, $s = 1, \ldots, p$,
is


7 2~7 1 7 i~J 2 li-Jp 

P(.QpQv)= Z Z ••■Z 


(63) 


(72-7 i;yi) (7 3-7 21^2 ) 


<1-1 


32-1 


--7 i ;yi) (73-7 2 ;y2) (A:~yp>u) 

• • .»p(MI 7 M2) * 

tp-0 


, Mp)]) 


where the event Qt determines the js — i.-i — Vs^i events of second subscript 
s, (« = 1, • • • , p), which have as first subscripts all numbers 1, 2, • • • , js which 
are not among the js^i + Vs^i numbers determined by the events of lower second 
subscript than s which are contained in Qi^ ... (mi , • • * , Hp). 

Proof. Expand (56) by means of Theorem IV. 
Corollary. If, for each fixed set of values of $\mu_1, \mu_2, \ldots, \mu_p$ the terms (58),
in number (60), are all equal, then

/i— ii lj~/j k—jp p 

(64) ri-O i»i*0 Bp-0 s-1 

M*I • • • I Mp)] O’p +1 = *)• 

Let 


(65) 


Tinu m, 


’ y l^p) ^ ^ 

,-,-l <,-i »„-l 


If all the terms of (65) are equal, then 




(66) • ,fJLp) = (fc; fiif fii, • •• , fip)P[qi.i(jJLi, • • • , Mp)]- 


Theorem VIII. The probability that of the $pk$ events (51) exactly $j_s$ events of
second subscript $s$, $s = 1, \ldots, p$, occur is


/a ~; i /tii* 

= E £ ••• S (-1)'*^'*^ 




(67) 


*'i“0 ‘'a-'O 


(m#;;. - /ii - ... - Ms-i) r(Mi, M*) • • • > Mp)- 


Proo/. If is the subset of 12 determined by the requirement 

that exactly j, of the events (51) occur (s = 1, • . • , p), then ,^) is the 

sum of 

y jl } ji jl y h “■ i2 , • • • > jp Jp-l) 

disjunct sets which may be obtained by replacing F by A in (56) and forming 
(56) for all selections of occurring events from k — j,^i events, 

(s = 1, . . • , p). By Axiom V, F(,i, is the sum of the probabilities of 
these disjunct sets. 

Substituting from (63), it is noted that all terms (61) which depend on the 
same m» > (« = !» • * • > p), have the same sign and that all Tini , ^ 2 , • • • , Mp) 
for which 

0 < Vs < i.+x - j, , (s = 1, . . . , p), 

appear and only those appear. Furthermore any particular term (61) will 
occur in those of the terms (63) the j, — occurring events of second sub- 
script s, (s == 1, . • . , p), of which contain a fixed p.-i events, the remaining 
js — js^i — Vs^i events being a subset of the events of second subscript s, 
(s = 1, • • • > p), that actually appear in the particular term (63). Hence the 
coeflScient of ^(mi , • • * , Mp) is 

(_i)-.+..+-p f[ 


(mo = 0). 
Corollary. If (66) is true for all sets of possible values of $\mu_1, \mu_2, \ldots, \mu_p$
then


■P«i jp) 


= ^ z ••• r (-i)'*+'*+- 


rj—o ri-»0 rj,— 0 


Voo; . . . . . ^ 

VijJt - - Vl] Vf , • • • ,7 p - - Vp^i, Vp) 

- • • • iMp)l- 

Theorem IX, The probability that of the pk events (51) at least , bid not more 
than , events of second subscript s occur, (« = 1, • • • , fl^), and exactly events 
of second subscript s occur, (s = gf + 1, * * • , p) is 

(69) = Z Z • • • Z «(,>.. •.y,)(l, <?*,•••, e,), 

e2—o «j— 0 0 

where, if a 1 in the position is denoted by 6* , (i = 2, • • • , g), 

+ • * * > ^71 > ‘ * J * * * > ^7s » 1 * * ’ > > * ' * I 

;p Jp+ 2 — Jc + l }g + l—3o J74+1~77,— 1 Jg— /l— 1 

= Z--- Z Z--- Z Z •••Z (-i)'>+'*+-+'' 

»'i, + l«0 ^yr’Jyrht >'7a-i"0 »'l-0 

(70) (;i + Pi - 1 ; Pi) • • • {jy, + p-y, - - Py 3~1 ~ 1 ; Py,) 

0*74 “H ““ i'ys ^rs *'74 ) * ' * Op “I" ^'p ~ Jp-i 

r(ii + J'l) • • • ,i7« + ^yz “• ^73-1 “ ^yz-u 0, • • • , 0, 
jyi 4" *'74 jyz ^73 > ’ * * > ip 4" ^'p jp-i >'/>— i)* 

Proof, We note first that there are 2^^^^ terms in (69). Since 

(71) P(5':v;;"V/p) = '£••• z z 

Xp— /(, X 2 — J2 Xi— Ji 

the theorem may be proved by a process of repeated summation. From (67) 
and (71) 

Aj A2 — Xi A8~A2 ^ — ip 

p(/i) - V v y' V +■•••+»': 


Xi"*Ji vi-0 r2""0 »'|J— 0 


(Xi 4“ ^i)(X2 4- ^2 — Xi — vi] P2) • • • (ip 4- "p — ip-i f'p-i; ^p) 

ir(Xi + Vi,\i + Pi — \i — Vi, •■•, jp + Vp — jp-i — Vp-i). 

For fixed values of X2 , Xa , ■ • • , X, there will occur in (72) all terms 

(73) T{ji + |8i , Xj + V* — ji — ft , • • • ,jp + "p — ip-i — >'p-i)> 

(ft = 0, • • • , X* - ji), (fl < V, < X,+i - X.), (s = 2, • • • , p), 

(X»+« ~ jg+, ® ~ 1) • ■ • ) P 0)> 

and any definite term (73) will occur in all 

(74) P{/i+«.x 
for which

0 < a < ft . 

In (74), the definite term (73) will have coefficient 

(_l)'’‘-+'*+ ••• +'»(j, + ft -j, + a)(x, + ^ _ ft ; 

(75) • . . Up + Vp - jp-i - Vp-i , Vp), (a = 0, 1, • • • , ft), 

(ft = 0, • • • , X* — ji). 

Hence, in (72) the definite term (73), will have coefficient 

(_l)^>+'«+ ... ij _ ft ; V2) 

••• Up + Vp - jp-i - vp-i ; Vp), 


(76) 

We now evaluate 




P CjiJa) p(ji) 

(Xs*--7p) ~ 


For any fixed values of Xa , ... , X, , there will occur in (77) all terms 
TUl + dl I is + ft — il — ) ^3 + «’s — J 2 — ft , 

• • • , ip “t" •'p ip— 1 •'p-i)» 

for which either 0 < ft < Xs — i* ; 0 < ft < is — ji — 1 or ft = is — ji + y, 
0 < Y < Xs — is ; 0 < ft < Xa — is — 7. 

Let 0 < ft < is — ii — 1 ; 0 < ft < Xa — is . Then the term (78) will occur 
in all 


i 2 +«.X 8 ,- • ', 7 p)f 


such that 


0 < a < ft . 

In (79), (78) will have coefficient 

/Qn^ (-l)'''+'’*-“+'‘+ -^''’(i, + ft - 1; ft)(is + ft - ii - ft - 1; ft - a) 
(X3 + J'a — ^2 — ft ; J'a) • • • Up + J'p — jp-i ~ I'p-i 5 J'p)- 
Hence in (77), (78) will have coefficient 

+ ft - 1; ft)(is + ft - ii - ft - 1; ft) 

(Xa + J'a - is - ft : •'a) • • • (ip + »'p - ip-i - »'p-i i i'p), 

(ft = 0, ... , is - ii - 1), (ft = 0, . . . , Xa - is), 

(p« = 0, . • . , X(^.i — Xt), (s = 3, . • . , p) J 

(^p+t ~ i»+»)» (* ~ 1) ■ ■ ■ > p “ 


( 81 ) 





Now let — — Then the 

term (78) will occur in all terms (79) such that


y < ot < 0i, 

and in (79), (78) will have coefficient (80). Summing for a, (a = y, ••• , ft), 
we obtain as the coefficient of (78) in (77) 


and 


Hence 

(82) 


0 , 


if ft > 7, 


+ A - 1; A)(X. + V, - ji - A ; 1^) 

• • • Up + V, - Jp_i - i-p.! ; V,), if A = -y. 

P[i't" ip) = ^<x, - f,)(l, 1) + •/,)(!, 0). 


If we examine (82), we note that the result of summing with respect to X* 
has been the replacement of (76) by two sums which are similar to (76) in that 
the next summation index, in this case X} , occurs in exactly two limits of sum- 
mation. If it can be shown that the two sums which occur in (82) each result 
in a pair of sums after summation with respect to Xs , or more exactly if 


Xs+8 

(83) 


yOs) 


* * * > l) *4" + ^2> ••*>^•>0) 


then the proof will be completed. 

Since the truth of (83) may be demonstrated in exactly the same way in 
which (82) has been shown to be true, the theorem is proved. 

Corollary. If (66) is true for all sets of possible values of $\mu_1, \mu_2, \ldots, \mu_p$
then


■®0'p + l» • • > 0| * * ■ > 0, * ’ • j , 0, • • • 0, • • • , * * * ) O 

;p /fif+i-jp+i J74+i*”77»-i n-Ji-i 

= Z--- S E ••• E ••• E (-D’-^--^'' 

^p-O »'a + l-0 *'7,-774~?7» 

(ji + Pi - 1 ; »'i) • • • (Jt. + »'r. - ir.-i - ^.-I - 1 ; »'ti) 

(84) Un "i" Pt4 jyi *'ti 1 > ^ 74 ) ■ ■ ■ Up “I" Pp jp—i Pp — 1 j Pp) 

(fc;ix + Pi, • • • ,h, + Pri — jy,-i — P7>-I>i74 

+ *'74 - jyt - Vy,, ••• ,jp + Vp- jp-i - Pp_l) 

f*[?l -l(il + Pl, • • • ,jy, + Vy, -jy,-l - Py,-1, 0, • • • , 0, 

jyt + P74 - ht - P7. > ■ ■ • > Jp + >'p - ip-i ~ >'p-i)l- 
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Let us again consider the experiments (19) and let have as possiUe results 

Oil’ (i=l, (8 = 1,2) (t = 1,2, 

Let 

0<‘> 3 0^’’ (» ■= 1. • • • » >•). 

i.e. oil’ occurs whenever OH’ occurs. Furthermore let the outcomes 

0}r,0lJ’, •••,011’ 

be mutually exclusive. 

Let 

Oj, , 

occur if and only if none of 

0‘*’ ••• 0^*’ 


(8 = 1, 2), 


occur. 

We may wish to know the probability that at least ji of On , ••• , Ott and 
at least jt , jt > ji , of On , On , On occur. 

From Theorem IX this probability is equal to 

(85) = Ril, 1) + fl(l, 0), 


where 

P(l, 1) = “S * (-l)’‘^'’(ii + .'1 - 1; n) 

Fj-«0 Fi—O 

(ji + I** — ji — n — 1 : n)T{ji + vi,jt + Vi — ji — vi), 

and 

B(l, 0) = ^ (— ll'Hii + vi — 1; vi)T{ji + vi). 

From (63) 

(86) Tiji + VI , jt + vj — ji — vi) = 2 2 

+ •'ijji + Vj — ji — Vi)], 

where, from (61) 

iilitUl + ^1| 32 + ^2 — jl — n) = 11 IT f 

F*«l f »»7 i 4* fi 4-1 

the subscripts 

(87) Oil, at, • • • , 
being the first subscripts of the $t_1$-th selection of $j_1 + \nu_1$ events of second sub-
script 1 from

Oil, On , • • • ,0ki, 

and the subscripts 


» • * * > 9 

being the first subscripts of the tV^^selection of ji + Pi events of second subscript 2, 
ji + vi of which are (87), from 

On f On f • * * 9 Oki . 

It is easy to see that 

PliilitUl + ^1} h + J'a Jl ““ >^l)] = II ^ 1 — S ^^(OLVl) "" iC P(Pa^i) 

Furthermore 

(88) T{ji + Pi) = ^ PlQiiiji + 

M-l 

where 

^ r . 'i 

PlQiiiji + vi)] — n < 1 — 2 -p(oLVi) f* 

imml ^ j 

Substituting from (86) and (88) into (85) the desired probability is obtained. 
It may be remarked that theorems which have the same relation to Theorems 
VII, VIII, and IX that Theorems IV, V, and VI have to Theorems I, II, and 
III may be obtained without much difficulty.
REPLY TO MR. WERTHEIMER’S PAPER

Richmond T. Zoch 


The attainment of rigor both in applied as well as pure mathematics is a slow 
process, and for this reason criticism of my paper, if constructive, is welcomed. 

Properties like continuity, differentiability, and dimensionality are local
properties, that is to say a function may be continuous or differentiable over a 
certain range but not outside this range, or otherwise a function may be con- 
tinuous or differentiable over a given range except for singular points. 

The presence of singularities in functions does not necessarily cancel their 
utility. Thus the function y = tan x contains points where it is discontinuous, 
but ordinarily it is regarded as a continuous function and the presence of these 
singular points seldom handicaps one when working with this function. Similarly,
the function $f = \bar{x} - \tfrac{1}{2}\,\mu_3/\mu_2$ is a function which satisfies all four Axioms as
stated in Whittaker and Robinson's book and expresses the mode of Pearson's
Type III curve as a symmetric function of the measures. The fact that this
function is not differentiable along the line $x_1 = x_2 = x_3 = \cdots = x_n$ will never
handicap the investigator, for unless the frequency distribution is clearly skew
the Type III curve would not be used to represent it.

It seems that Mr. Wertheimer bases nearly all his criticisms on the tacit
addition of the word “everywhere” to Axiom IV as stated in Whittaker and
Robinson's book. The word “everywhere” is not in the statement of Axiom
IV and I assumed nothing else than stated in the axiom.

If one deliberately adds the word “everywhere” to Axiom IV then nearly all
my criticisms of previous writers are incorrect, unfair, and unjust. However,
it does not seem that clearness and rigor in mathematics are increased by reading
into an axiom a word that is not there.

Consider first the criticism in my paper which remains valid even when the
word “everywhere” is added. (Schimmack uses the word “everywhere” on
page 127 although Whittaker and Robinson do not.) Both Schimmack and
Whittaker and Robinson proceed as at the top of page 217 of the book by the
latter authors with the statement: “In this equation make $k \to 0$; then each

of the quantities J tends to a value which is independent of the x's • • • ." 
This statement rests on the tacit assumption that the quantities T ^1 are func- 


tions of k. Even if such were true the use of tacit assumptions in a rigorous 
proof is objectionable, but as a matter of fact these quantities are not functions 
of $k$. Thus the particular proof given in Whittaker and Robinson's book as
well as in Schimmack’s paper is altogether lacking in rigor even when the word
“everywhere” is added to Axiom IV. Both Schiaparelli's and Broggi’s proofs
appear to be entirely rigorous if the word “everywhere” is added to Axiom IV.

In preparing my paper I assumed that no prohibition on functions which had
singular points was contained in Axiom IV. In other words, I assumed that since
the word “everywhere” did not appear there was no valid objection to introducing
and discussing functions with singularities. The functions I introduced are
everywhere continuous but they are not differentiable along the line in Euclidean
n-space defined by $x_1 = x_2 = \cdots = x_n$. They are differentiable at every
other point in the space.

It seems to me, since Axiom IV as stated in Whittaker and Robinson's book
does not exclude functions which are not everywhere differentiable, that all my
criticism is fair and just, and moreover nearly all my statements are correct.
Mr. Wertheimer is entirely correct in pointing out that the words “everywhere”
on page 181 of my paper are contradictory. As a matter of fact the whole
paragraph beginning with line 7 on page 181 appears to me, on reexamining it,
to be unsatisfactory. Except for this single paragraph I believe my paper to
be rigorous, but I welcome further criticism.

Mr. Wertheimer's conclusions in his paragraph number 4 are clearly erroneous.
To show this, consider a function of $k$. As $k \to 0$ any one of three situations
may arise, namely: (1) the function may become infinite, (2) the function
may become indeterminate, that is it may take on any value whatever,
(3) the function may approach a unique finite value independent of $k$. Neither
Schimmack nor Whittaker and Robinson nor Mr. Wertheimer has established
as a definite fact that the particular type of function here in question approaches
a unique finite value independent of $k$ as $k \to 0$. The truth of the matter is that
this conclusion cannot be established because the function in question does not
involve $k$ either explicitly or implicitly.

In conclusion there are two things I wish to emphasize. First, even when 
the word “everywhere” is added to Axiom IV, the proof given in Whittaker 
and Robinson's book is faulty, but if one consults the references given there 
in the footnotes he will find two other proofs which are rigorous with this ad- 
dition to Axiom IV. Second, the mode of a skew bell shaped Pearson Fre- 
quency Curve satisfies all four axioms as stated in Whittaker and Robinson's 
book, and the fact that these expressions for the mode are not differentiable 
along a certain line is never a handicap to the statistician. 


George Washington University.



CORRELATION SURFACES OF TWO OR MORE INDICES WHEN THE 
COMPONENTS OF THE INDICES ARE NORMALLY DISTRIBUTED 


By George A. Baker

Indices are widely used in statistical analyses.¹ In many cases incorrect
conclusions are drawn because indices are not uncorrelated or independent even
though all of the component variables are independent. In a previous paper²
the distribution of an index both of whose components follow the normal law was
given exactly, i.e. without approximation. The purpose of the present paper is
to give the simultaneous distribution of two or more indices when each of the
components follows the normal law. The case for two indices will be discussed
in detail and the extension to more indices will be indicated.

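Let $x_1$, $x_2$, and $x_3$ be correlated variables each being normally distributed
about their respective means $m_1$, $m_2$, $m_3$, with standard deviations $\sigma_1$, $\sigma_2$, $\sigma_3$,
and let the correlations between the variables in pairs be represented by $r_{12}$,
$r_{13}$, $r_{23}$. Then the simultaneous distribution of these three variables will be

$$\frac{1}{(2\pi)^{3/2} R^{1/2} \sigma_1 \sigma_2 \sigma_3} \exp\left\{ -\frac{1}{2R} \left[ \frac{R_{11}(x_1 - m_1)^2}{\sigma_1^2} + \frac{R_{22}(x_2 - m_2)^2}{\sigma_2^2} + \frac{R_{33}(x_3 - m_3)^2}{\sigma_3^2} + 2R_{12}\frac{(x_1 - m_1)(x_2 - m_2)}{\sigma_1 \sigma_2} + 2R_{13}\frac{(x_1 - m_1)(x_3 - m_3)}{\sigma_1 \sigma_3} + 2R_{23}\frac{(x_2 - m_2)(x_3 - m_3)}{\sigma_2 \sigma_3} \right] \right\} dx_1\, dx_2\, dx_3 \tag{1}$$

where

$$R = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{12} & 1 & r_{23} \\ r_{13} & r_{23} & 1 \end{vmatrix}$$

and $R_{ij}$ are the respective second order minors of $R$.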


¹ Rietz, H. L., “On the Frequency Distribution of Certain Ratios,” Annals of Mathematical
Statistics, Vol. VII, No. 3, Sept. 1936, pp. 145-153.

² Baker, G. A., “Distribution of the Means Divided by the Standard Deviations of
Samples From Non-homogeneous Populations,” Annals of Mathematical Statistics, Feb.
1932, pp. 3-5.
If we make the transformation

$$z_1 = \frac{x_1}{x_3},\ \ x_1 = z_1 z_3; \qquad z_2 = \frac{x_2}{x_3},\ \ x_2 = z_2 z_3; \qquad z_3 = x_3,\ \ x_3 = z_3; \qquad dx_1\, dx_2\, dx_3 = z_3^2\, dz_1\, dz_2\, dz_3,$$

which is certainly valid if $x_1$, $x_2$, $x_3$ are all positive, then (1) becomes

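$$\frac{1}{(2\pi)^{3/2} R^{1/2} \sigma_1 \sigma_2 \sigma_3} \exp\left\{ -\frac{1}{2R} \left[ \frac{R_{11}(z_1 z_3 - m_1)^2}{\sigma_1^2} + \frac{R_{22}(z_2 z_3 - m_2)^2}{\sigma_2^2} + \frac{R_{33}(z_3 - m_3)^2}{\sigma_3^2} + 2R_{12}\frac{(z_1 z_3 - m_1)(z_2 z_3 - m_2)}{\sigma_1 \sigma_2} + 2R_{13}\frac{(z_1 z_3 - m_1)(z_3 - m_3)}{\sigma_1 \sigma_3} + 2R_{23}\frac{(z_2 z_3 - m_2)(z_3 - m_3)}{\sigma_2 \sigma_3} \right] \right\} z_3^2\, dz_1\, dz_2\, dz_3. \tag{2}$$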


If $x_1$, $x_2$, $x_3$ are all positive the corresponding distribution of $z_1$ and $z_2$ can be
obtained by integrating (2) between the limits 0 and $\infty$ with respect to $z_3$.
If $x_1$, $x_2$, $x_3$ are all negative $z_1$ and $z_2$ are again both positive so that in order to
get the total distribution for $z_1$ and $z_2$ it is necessary to add to the integral of (2)
between the limits 0 and $\infty$ with respect to $z_3$ the similar integral of (2) with $z_3$
replaced by $-z_3$. The result is

1 e 1 6> 

2e“ 


(3) 


where 


a H^a Ka 


(2T)*ie* 


(Tl flTa <Tz 


^\/2 a* a* Jo 


dz + 


2 * 6 * W 
^ \/2_ 


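$$a = \frac{R_{11}}{\sigma_1^2} z_1^2 + \frac{R_{22}}{\sigma_2^2} z_2^2 + \frac{R_{33}}{\sigma_3^2} + \frac{2R_{12}}{\sigma_1 \sigma_2} z_1 z_2 + \frac{2R_{13}}{\sigma_1 \sigma_3} z_1 + \frac{2R_{23}}{\sigma_2 \sigma_3} z_2,$$

$$b = \frac{R_{11}}{\sigma_1^2} m_1 z_1 + \frac{R_{22}}{\sigma_2^2} m_2 z_2 + \frac{R_{33}}{\sigma_3^2} m_3 + \frac{R_{12}}{\sigma_1 \sigma_2} m_1 z_2 + \frac{R_{12}}{\sigma_1 \sigma_2} m_2 z_1 + \frac{R_{13}}{\sigma_1 \sigma_3} m_1 + \frac{R_{13}}{\sigma_1 \sigma_3} m_3 z_1 + \frac{R_{23}}{\sigma_2 \sigma_3} m_3 z_2 + \frac{R_{23}}{\sigma_2 \sigma_3} m_2,$$

$$c = \frac{R_{11}}{\sigma_1^2} m_1^2 + \frac{R_{22}}{\sigma_2^2} m_2^2 + \frac{R_{33}}{\sigma_3^2} m_3^2 + \frac{2R_{12}}{\sigma_1 \sigma_2} m_1 m_2 + \frac{2R_{13}}{\sigma_1 \sigma_3} m_1 m_3 + \frac{2R_{23}}{\sigma_2 \sigma_3} m_2 m_3.$$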

The same result (3) is obtained for $z_1$ and $z_2$ negative, $z_1$ positive and $z_2$
negative, $z_1$ negative and $z_2$ positive. That is, (3) is the simultaneous distribution
of $z_1$ and $z_2$. The extension to more than 2 indices is immediate. The form of
the distribution of the indices and the denominator variable is the same as (2)
except that $a$, $b$, and $c$, the coefficients of $z_3^2$, $z_3$ and the constant term respectively
in the exponent of $e$, will be different in that they will include the new indices, and
the exponent on the denominator variable will be the same as the number of
indices involved. The distribution of the indices will again be obtained by
integrating from 0 to $\infty$ with respect to the denominator variable.
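The integration over the denominator variable can be illustrated numerically. The sketch below treats only the independent case (all correlations zero), integrates $z_3^2$ times the product of three normal densities over the whole real line for $z_3$ (which combines the two half-line integrals described above), and compares the result with the closed form that follows from the standard Gaussian second-moment integral. The function names, the trapezoidal rule, and the closed form itself are assumptions introduced for illustration; the closed form is not the author's expression (3).

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal variable with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def index_density_numeric(z1, z2, means, sds, half_width=30.0, steps=20000):
    """Trapezoidal estimate of the integral over z3 of
    z3^2 * f1(z1*z3) * f2(z2*z3) * f3(z3), for independent normal x1, x2, x3."""
    m1, m2, m3 = means
    s1, s2, s3 = sds
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        z3 = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += (weight * z3 * z3
                  * normal_pdf(z1 * z3, m1, s1)
                  * normal_pdf(z2 * z3, m2, s2)
                  * normal_pdf(z3, m3, s3))
    return total * h


def index_density_closed(z1, z2, means, sds):
    """Closed form of the same integral, writing the exponent as
    -(a*z3^2 - 2*b*z3 + c)/2 and completing the square in z3."""
    m1, m2, m3 = means
    s1, s2, s3 = sds
    a = z1 ** 2 / s1 ** 2 + z2 ** 2 / s2 ** 2 + 1.0 / s3 ** 2
    b = m1 * z1 / s1 ** 2 + m2 * z2 / s2 ** 2 + m3 / s3 ** 2
    c = m1 ** 2 / s1 ** 2 + m2 ** 2 / s2 ** 2 + m3 ** 2 / s3 ** 2
    prefactor = 1.0 / ((2.0 * math.pi) ** 1.5 * s1 * s2 * s3)
    return (prefactor * math.exp(-0.5 * (c - b * b / a))
            * math.sqrt(2.0 * math.pi / a) * (1.0 / a + (b / a) ** 2))
```

Agreement of the two functions at a few points $(z_1, z_2)$ is a quick check that the quadratic in $z_3$ has been collected correctly.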

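The case when all of the variables $x_1$, $x_2$, $x_3$ are independent is especially
interesting. If $r_{12}$, $r_{13}$, $r_{23}$ are all zero then $R = R_{11} = R_{22} = R_{33} = 1$,
$R_{12} = R_{13} = R_{23} = 0$ and $a$, $b$, $c$ become $a'$, $b'$, $c'$, respectively:

$$a' = \frac{z_1^2}{\sigma_1^2} + \frac{z_2^2}{\sigma_2^2} + \frac{1}{\sigma_3^2}, \qquad b' = \frac{m_1 z_1}{\sigma_1^2} + \frac{m_2 z_2}{\sigma_2^2} + \frac{m_3}{\sigma_3^2}, \qquad c' = \frac{m_1^2}{\sigma_1^2} + \frac{m_2^2}{\sigma_2^2} + \frac{m_3^2}{\sigma_3^2}.$$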


Under these conditions and the further condition that $m_1$, $m_2$, $m_3$ are large with
respect to $\sigma_1$, $\sigma_2$, $\sigma_3$ respectively so that the integral term of (3) may be neglected,
(3) becomes


(4) 


( wf m* 

-+-H 





1 + 


/ rniZi mtZj mV 
\ <rj 

( 4 + 4 +^) 

Vi at az/ 


It is clear that $z_1$ and $z_2$ are not independent in the probability sense for
distribution (4).

The question as to the possibility of having the variables independent and the
indices independent at the same time arises. Denote the distribution functions
of $x_1$, $x_2$, $x_3$ by $X_1(x_1)$, $X_2(x_2)$, $X_3(x_3)$ and of $z_1$, $z_2$ by $Z_1(z_1)$, $Z_2(z_2)$. Then, if
$x_i > 0$, $i = 1, 2, 3$, it is necessary that

$$\int_a^b X_1(z_3 z_1)\, X_2(z_3 z_2)\, X_3(z_3)\, z_3^2\, dz_3 = Z_1(z_1)\, Z_2(z_2), \tag{5}$$

$a$ and $b$ being suitable limits.

For instance, let 

Xiixj) = 1 < xi < 3 

Xl 


X»(x^ = 4, 1 < xt < 3 

Xj 

X,(a^) - X?, 1 < X, < 2 





then 


Ziizd - -i 

2l 

ZtW = 1 
2 * 

for values of $z_1$ and $z_2$ within a straight-line-sided area the corners of which are
i)> 1)> (1> 1) (1) 2), Zi , and Zt are not uncorrelated throughout their 

entire set of values but are for this particular set of values. Thus it appears
that it is possible that the indices may be independent when the variables are,
but not necessarily so.

Indices should be used with care since it is very easy to draw invalid conclu- 
sions from the consideration of them. Usually it is better to use partial corre- 
lation analysis to remove the influence of a third factor than to calculate indices. 



THE TYPE B GRAM-CHARLIER SERIES 


By Leo A. Aroian 

While much attention has been devoted to the Type A Gram-Charlier series 
for the graduation of frequency curves, the Type B series has been somewhat 
neglected. However the numerical examples to be presented later will show 
that the Type B series is very useful for the graduation of skew frequency 
curves. Wicksell¹ has demonstrated that the Gram-Charlier series may be
developed from the same law of probability which forms the basis of the Pearson
system of frequency curves. Rietz² following Wicksell gives a derivation of the
Gram-Charlier series based on the binomial $(q + p)^n$. Jordan³ gives a method
for fitting Type B based on certain orthogonal polynomials which he calls $G$.
He uses factorial moments because of the resulting ease in finding the values 
of the constants. 

We shall consider the Type B series for a distribution of equally distanced
ordinates at non-negative values of $x$. We shall find the values of the first few
terms of the series and shall also show how the values of later coefficients may
easily be found. We write the Type B series in the form

$$F(x) = c_0\psi(x) + c_1\Delta\psi(x) + c_2\Delta^2\psi(x) + c_3\Delta^3\psi(x) + c_4\Delta^4\psi(x) + c_5\Delta^5\psi(x) + c_6\Delta^6\psi(x) \tag{1}$$

where

$$\psi(x) = \frac{e^{-m} m^x}{x!}, \quad m = \text{the mean}, \qquad \Delta\psi(x) = \psi(x) - \psi(x - 1) \ \text{ for } x = 0, 1, 2, \ldots, s. \tag{2}$$

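Let $f(x)$ give the ordinates of the observed distribution of relative frequencies,
so that $\sum f(x) = 1$. To determine the coefficients $c_0, c_1, c_2, \ldots, c_6$, we have,
using the method of moments,

$$\begin{aligned}
\sum [c_0\psi(x) + c_1\Delta\psi(x) + \cdots + c_6\Delta^6\psi(x)] &= \sum f(x) = 1, \\
\sum x\,[c_0\psi(x) + c_1\Delta\psi(x) + \cdots + c_6\Delta^6\psi(x)] &= \sum x f(x) = m, \\
\sum x^2 [c_0\psi(x) + c_1\Delta\psi(x) + \cdots + c_6\Delta^6\psi(x)] &= \sum x^2 f(x) = \mu'_2, \\
\sum x^3 [c_0\psi(x) + c_1\Delta\psi(x) + \cdots + c_6\Delta^6\psi(x)] &= \sum x^3 f(x) = \mu'_3, \\
\sum x^4 [c_0\psi(x) + c_1\Delta\psi(x) + \cdots + c_6\Delta^6\psi(x)] &= \sum x^4 f(x) = \mu'_4, \\
\sum x^5 [c_0\psi(x) + c_1\Delta\psi(x) + \cdots + c_6\Delta^6\psi(x)] &= \sum x^5 f(x) = \mu'_5, \\
\sum x^6 [c_0\psi(x) + c_1\Delta\psi(x) + \cdots + c_6\Delta^6\psi(x)] &= \sum x^6 f(x) = \mu'_6.
\end{aligned} \tag{3}$$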
Hence we must find the values of

$$\sum_{x=0}^{s} x^n \Delta^p \psi(x), \qquad n = 0, 1, 2, 3, \ldots; \quad p = 0, 1, 2, 3, \ldots, \tag{4}$$

defining $\Delta^0\psi(x) \equiv \psi(x)$. We assume that we are dealing with distributions
in which $s$ is large, and that the error involved in substituting $\sum_{x=0}^{\infty} x^n \Delta^p \psi(x)$ for
$\sum_{x=0}^{s} x^n \Delta^p \psi(x)$ is negligible. To find these summations in a straightforward
manner would involve too much labor, so we shall briefly discuss some properties
of the generating function, $\psi(x) = e^{-m} m^x / x!$, the Poisson exponential, very useful
in the graduation of frequency distributions of rare events. The first eight
moments about the origin are:

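$$\begin{aligned}
\mu'_0 &= 1 = \sum \psi(x), \qquad \mu'_1 = m = \sum x\psi(x), \qquad \mu'_2 = m + m^2 = \sum x^2\psi(x), \\
\mu'_3 &= m + 3m^2 + m^3 = \sum x^3\psi(x), \\
\mu'_4 &= m + 7m^2 + 6m^3 + m^4 = \sum x^4\psi(x), \\
\mu'_5 &= m + 15m^2 + 25m^3 + 10m^4 + m^5 = \sum x^5\psi(x), \\
\mu'_6 &= m + 31m^2 + 90m^3 + 65m^4 + 15m^5 + m^6 = \sum x^6\psi(x), \\
\mu'_7 &= m + 63m^2 + 301m^3 + 350m^4 + 140m^5 + 21m^6 + m^7 = \sum x^7\psi(x), \\
\mu'_8 &= m + 127m^2 + 966m^3 + 1701m^4 + 1050m^5 + 266m^6 + 28m^7 + m^8 = \sum x^8\psi(x).
\end{aligned} \tag{5}$$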

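These may be found by the formula given by Jordan,

$$\mu'_{n+1} = m\left(\mu'_n + \frac{d\mu'_n}{dm}\right). \tag{6}$$

Proof:

$$\frac{d\psi(x)}{dm} = \frac{(x - m)\,\psi(x)}{m}.$$

We multiply by $x^n$ and sum, giving (6). This result may readily be proved also
by means of recursion formulas without differentiation. Now we must find the
values of

$$\sum x^n \Delta^p \psi(x), \qquad n = 0, 1, 2, \ldots; \quad p = 1, 2, 3, \ldots.$$

We do this by proving

$$\sum_{x=0}^{\infty} x^n \Delta^{s+1}\psi(x) = -\frac{d}{dm} \sum_{x=0}^{\infty} x^n \Delta^s \psi(x). \tag{7}$$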
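The moment recursion for the Poisson exponential lends itself to a mechanical check. The sketch below assumes the recursion in the usual form $\mu'_{n+1} = m(\mu'_n + d\mu'_n/dm)$ and stores each moment as a list of polynomial coefficients in $m$; the function name and the list representation are illustrative choices, not part of the paper.

```python
def poisson_raw_moment_polys(n_max):
    """Return polys[n][k] = coefficient of m**k in the n-th moment about
    the origin of the Poisson exponential, for n = 0, ..., n_max."""
    polys = [[1]]  # mu'_0 = 1
    for _ in range(n_max):
        current = polys[-1]
        nxt = [0] * (len(current) + 1)
        for k, coef in enumerate(current):
            nxt[k + 1] += coef      # contribution of m * mu'_n
            nxt[k] += k * coef      # contribution of m * d(mu'_n)/dm
        polys.append(nxt)
    return polys
```

For example, `poisson_raw_moment_polys(4)[4]` lists the coefficients of $m^0, \ldots, m^4$ in $\mu'_4$; the coefficients produced this way are the Stirling numbers of the second kind.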
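Now

$$\frac{d\psi(x)}{dm} = \psi(x - 1) - \psi(x) = -\Delta\psi(x). \tag{8}$$

Hence

$$\frac{d}{dm}\Delta^s\psi(x) = \frac{d}{dm}\left[\psi(x) - \binom{s}{1}\psi(x - 1) + \binom{s}{2}\psi(x - 2) + \cdots + (-1)^s\psi(x - s)\right],$$

since $\Delta^s\psi(x) = \psi(x) - \binom{s}{1}\psi(x - 1) + \binom{s}{2}\psi(x - 2) + \cdots + (-1)^s\psi(x - s)$.

Then by (8)

$$\frac{d}{dm}\Delta^s\psi(x) = \left[\psi(x - 1) - \psi(x) - \binom{s}{1}\psi(x - 2) + \binom{s}{1}\psi(x - 1) + \binom{s}{2}\psi(x - 3) - \binom{s}{2}\psi(x - 2) + \cdots + (-1)^s\psi(x - s - 1) - (-1)^s\psi(x - s)\right],$$

$$\begin{aligned}
\frac{d}{dm}\Delta^s\psi(x) &= -\psi(x) + \binom{s+1}{1}\psi(x - 1) - \binom{s+1}{2}\psi(x - 2) + \cdots - (-1)^{s+1}\psi(x - s - 1) \\
&= -\left[\psi(x) - \binom{s+1}{1}\psi(x - 1) + \binom{s+1}{2}\psi(x - 2) + \cdots + (-1)^{s+1}\psi(x - s - 1)\right] \\
&= -\Delta^{s+1}\psi(x).
\end{aligned} \tag{9}$$

We multiply (9) by $x^n$, sum with respect to $x$, giving (7).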

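Thus by use of (7) and (5) we get:

$$\begin{aligned}
\sum \Delta^p\psi(x) &= 0, \quad p = 1, 2, 3, \ldots \\
\sum x\Delta\psi(x) &= -\frac{dm}{dm} = -1. \\
\sum x^2\Delta\psi(x) &= -\frac{d}{dm}(m + m^2) = -2m - 1. \\
\sum x^3\Delta\psi(x) &= -3m^2 - 6m - 1. \\
\sum x^4\Delta\psi(x) &= -4m^3 - 18m^2 - 14m - 1. \\
\sum x^5\Delta\psi(x) &= -5m^4 - 40m^3 - 75m^2 - 30m - 1. \\
\sum x^6\Delta\psi(x) &= -6m^5 - 75m^4 - 260m^3 - 270m^2 - 62m - 1. \\
\sum x\Delta^2\psi(x) &= 0, \quad \sum x^2\Delta^2\psi(x) = 2, \quad \sum x^3\Delta^2\psi(x) = 6m + 6. \\
\sum x^4\Delta^2\psi(x) &= 12m^2 + 36m + 14. \\
\sum x^5\Delta^2\psi(x) &= 20m^3 + 120m^2 + 150m + 30. \\
\sum x^6\Delta^2\psi(x) &= 30m^4 + 300m^3 + 780m^2 + 540m + 62. \\
\sum x\Delta^3\psi(x) &= 0, \quad \sum x^2\Delta^3\psi(x) = 0, \quad \sum x^3\Delta^3\psi(x) = -6. \\
\sum x^4\Delta^3\psi(x) &= -24m - 36, \quad \sum x^5\Delta^3\psi(x) = -60m^2 - 240m - 150. \\
\sum x^6\Delta^3\psi(x) &= -120m^3 - 900m^2 - 1560m - 540. \\
\sum x\Delta^4\psi(x) &= 0, \quad \sum x^2\Delta^4\psi(x) = 0, \quad \sum x^3\Delta^4\psi(x) = 0, \quad \sum x^4\Delta^4\psi(x) = 24. \\
\sum x^5\Delta^4\psi(x) &= 120m + 240, \quad \sum x^6\Delta^4\psi(x) = 360m^2 + 1800m + 1560. \\
\sum x^n\Delta^5\psi(x) &= 0 \ (n = 1, 2, 3, 4), \quad \sum x^5\Delta^5\psi(x) = -120, \quad \sum x^6\Delta^5\psi(x) = -720m - 1800. \\
\sum x^n\Delta^6\psi(x) &= 0 \ (n = 1, 2, 3, 4, 5), \quad \sum x^6\Delta^6\psi(x) = 720.
\end{aligned} \tag{10}$$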
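The sums of this kind can be cross-checked numerically. The sketch below, under the assumption that $\Delta$ is the backward difference with $\psi(x) = 0$ for $x < 0$, evaluates $\sum x^n\Delta^p\psi(x)$ by direct summation and compares it with $(-1)^p\, d^p\mu'_n/dm^p$, which is what repeated use of the derivative identity gives. The helper names and the truncation point `x_max` are illustrative assumptions.

```python
import math


def poisson_pmf(m, x_max):
    """psi(0), ..., psi(x_max) for the Poisson exponential with mean m."""
    psi = [math.exp(-m)]
    for x in range(1, x_max + 1):
        psi.append(psi[-1] * m / x)
    return psi


def backward_difference(values, order):
    """Apply the backward difference `order` times, taking values[-1] as 0."""
    out = list(values)
    for _ in range(order):
        out = [out[0]] + [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def moment_sum(n, p, m, x_max=150):
    """Direct evaluation of sum over x of x**n * Delta**p psi(x)."""
    diffs = backward_difference(poisson_pmf(m, x_max), p)
    return sum((x ** n) * d for x, d in enumerate(diffs))


def raw_moment_poly(n):
    """Coefficients (in powers of m) of the n-th Poisson moment about the origin."""
    poly = [1]
    for _ in range(n):
        nxt = [0] * (len(poly) + 1)
        for k, coef in enumerate(poly):
            nxt[k + 1] += coef
            nxt[k] += k * coef
        poly = nxt
    return poly


def signed_derivative(n, p, m):
    """(-1)**p times the p-th derivative of the n-th moment, evaluated at m."""
    coeffs = raw_moment_poly(n)
    for _ in range(p):
        coeffs = [k * coeffs[k] for k in range(1, len(coeffs))]
    return (-1) ** p * sum(c * m ** k for k, c in enumerate(coeffs))
```

With $m = 2$, `moment_sum(6, 2, 2.0)` and `signed_derivative(6, 2, 2.0)` can be compared directly against the polynomial $30m^4 + 300m^3 + 780m^2 + 540m + 62$.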

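Finally we substitute from (5) and (10) into (3), and for the moments about the
origin we substitute their values in terms of $m$ and the moments about the mean, $\mu_n$.
Hence

$$\begin{aligned}
c_0 &= 1, \\
c_1 &= 0, \\
c_2 &= \tfrac{1}{2}(\mu_2 - m), \\
c_3 &= -\tfrac{1}{3!}(\mu_3 - 3\mu_2 + 2m), \\
c_4 &= \tfrac{1}{4!}\left[\mu_4 - 6\mu_3 + \mu_2(11 - 6m) + 3m(m - 2)\right], \\
c_5 &= -\tfrac{1}{5!}\left[\mu_5 - 10\mu_4 - \mu_3(10m - 35) + 50\mu_2(m - 1) - 4m(5m - 6)\right], \\
c_6 &= \tfrac{1}{6!}\left[\mu_6 - 15\mu_5 + \mu_4(85 - 15m) + \mu_3(130m - 225) + \mu_2(45m^2 - 375m + 274) - 15m^3 + 130m^2 - 120m\right].
\end{aligned} \tag{11}$$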
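The coefficient formulas translate directly into a short routine. The sketch below is an illustration only: the function name and the dictionary of central moments are assumptions, and the $\mu_3$ factor in $c_5$ is taken as $(10m - 35)$, the value for which $c_5$ vanishes when the moments are exactly those of a Poisson exponential.

```python
def type_b_coefficients(m, mu):
    """Type B coefficients c_0, c_1, ... from the mean m and a dict `mu`
    mapping k to the k-th moment about the mean; stops at the first
    missing moment."""
    c = [1.0, 0.0]
    if 2 not in mu:
        return c
    c.append((mu[2] - m) / 2.0)
    if 3 not in mu:
        return c
    c.append(-(mu[3] - 3.0 * mu[2] + 2.0 * m) / 6.0)
    if 4 not in mu:
        return c
    c.append((mu[4] - 6.0 * mu[3] + mu[2] * (11.0 - 6.0 * m)
              + 3.0 * m * (m - 2.0)) / 24.0)
    if 5 not in mu:
        return c
    c.append(-(mu[5] - 10.0 * mu[4] - mu[3] * (10.0 * m - 35.0)
               + 50.0 * mu[2] * (m - 1.0) - 4.0 * m * (5.0 * m - 6.0)) / 120.0)
    if 6 not in mu:
        return c
    c.append((mu[6] - 15.0 * mu[5] + mu[4] * (85.0 - 15.0 * m)
              + mu[3] * (130.0 * m - 225.0)
              + mu[2] * (45.0 * m ** 2 - 375.0 * m + 274.0)
              - 15.0 * m ** 3 + 130.0 * m ** 2 - 120.0 * m) / 720.0)
    return c
```

Called with the moments listed under Table 1, `type_b_coefficients(4.175, {2: 7.66237, 3: 15.1069, 4: 173.326})` returns values that can be compared with the tabulated $c_2$, $c_3$, $c_4$.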

It may be asked whether criteria may be given as guides for the use of Type B.
In general Type B may be tried if either the skewness of the distribution to be
fitted is considerable, $\alpha_3 = \mu_3/\mu_2^{3/2} > .6$, or if $m = \mu_2 = \mu_3$ approximately. The
latter condition strictly would mean that $\psi(x)$ alone is sufficient for a good
graduation, if the fourth moment, $\mu_4$, is not used. The examples which follow
are arranged to facilitate comparison with the Pearson system of frequency
curves. We have an example each of Type I, III, IV, V, VI, and an example of
the normal curve.
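The skewness criterion is easy to evaluate from the moments. The sketch below assumes the usual definitions $\alpha_3 = \mu_3/\mu_2^{3/2}$ and $\alpha_4 = \mu_4/\mu_2^2$, which reproduce the values tabulated with the examples; the function name is illustrative.

```python
def shape_ratios(mu2, mu3, mu4):
    """Return (alpha_3, alpha_4) computed from moments about the mean."""
    alpha3 = mu3 / mu2 ** 1.5
    alpha4 = mu4 / mu2 ** 2
    return alpha3, alpha4
```

For the moments listed with Table 1, `shape_ratios(7.66237, 15.1069, 173.326)` gives an $\alpha_3$ above the .6 threshold mentioned in the text.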

Type I. Table 1. Here $\alpha_3 > .6$ although $m \ne \mu_2 \ne \mu_3$. The first four
moments, unadjusted, give an excellent fit by Type B, which is not quite as good
as Type I. The degrees of freedom, according to Fisher,* have been taken into
consideration here in applying the $\chi^2$ test. The two classes 13, 14, were grouped
together for the test. The actual numerical work is easily done on a cal- 
culating machine, although logarithms are necessary to find the value of 
This example and the remaining are all taken from Elderton^ with the exception 
of Type IV which is from A. Fisher.® 

Type III. Table 2. The unadjusted moments are used. Here $\alpha_3 = 2.0833
> .6$, and $m = \mu_2$ approximately. The fit by Type B is slightly better than that
by Type III. We have for Type III $P(\chi^2 > 12.8) = .007$, $n = 3$, while for Type
B, $P(\chi^2 > 9.4) = .025$, $n = 3$. Moreover the standard error of prediction for
Type III is 11.2 and for Type B is 7.7.

Type IV. Table 3. The rough moments were used. Although $\alpha_3 = .48 < .6$,
Type B gives a fine fit since $m = \mu_2 = \mu_3$ approximately. Here the results are
given for Type B using 2, 3, and 4 terms of the series. This was done to show 
how the distribution changes with the addition of more terms. The superiority 
of Type B over Type IV is evident. The results for Type IV are taken from the 
class notes of Professor C. C. Craig. 

Type V. Table 4. Using the adjusted moments we have a comparison among 
Types V, A, and B. While the graduations may seem satisfactory, the $\chi^2$ test
shows that the fit is poor in each case. The order of merit is Type V, Type B,
and then Type A. The negative frequencies which appear in Type B may be
due to the use of the adjusted moments. If we use the rough moments, the
negative frequencies disappear. On the whole the fit by means of the adjusted 
moments is superior. 

Type VI. Table 5. Type VI using the adjusted moments gives an excellent
fit. Even though $\alpha_3$ is considerable, and $\mu_2 = \mu_3$ approximately, four moments
with Type B give a poor fit, and five moments, adjusted, achieve a very small
gain. Five moments using the unadjusted moments give some improvement,
but the $-2$ frequency in the first class is objectionable.

Normal Curve. Table 6. The normal curve provides a fine fit, $P(\chi^2 > .9) =
.96$, $n = 6$. The first two and the last two classes were grouped together for the
$\chi^2$ test. The fit by Type B is less probable, $P(\chi^2 > 8) = .15$, $n = 5$. Type B has
two discrepancies, the negative frequencies, and the fact that the total of the
frequencies (neglecting the $-1$) is 352. That Type B does so well is in itself
quite amazing!
TABLE 1

  x    Actual      Frequency computed    Frequency given
       frequency   by Pearson Type I     by Type B
  0       34              44                  42.4
  1      145             137                 121.3
  2      156             149                 168.7
  3      145             142                 156.8
  4      123             127                 120.5
  5      103             108                  94.9
  6       86              88                  82.9
  7       71              69                  72.2
  8       55              51                  56.7
  9       37              36                  38.0
 10       21              24                  23.1
 11       13              14                  12.0
 12        7               7                   5.7
 13        3               3                   2.4
 14        1               1                    .9

$m = 4.175$, $\mu_2 = 7.66237$, $\mu_3 = 15.1069$, $\mu_4 = 173.326$

$\alpha_3 = .712247$, $\alpha_4 = 2.95214$, $c_2 = 1.74368$, $c_3 = -.078298$, $c_4 = +.094592$

Type I: $P(\chi^2 > 4.36) = .88$, $n$ (number of degrees of freedom) $= 9$

Type B: $P(\chi^2 > 9.67) = .37$, $n = 9$

$F(x) = \psi(x) + 1.74368\,\Delta^2\psi(x) - .078298\,\Delta^3\psi(x) + .094592\,\Delta^4\psi(x).$
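The graduated frequencies of Table 1 can be regenerated from the fitted series. The sketch below assumes backward differences with $\psi(x) = 0$ for $x < 0$ and scales the relative frequencies by the total of the observed column, which sums to 1000; the function name and argument layout are illustrative.

```python
import math


def type_b_frequencies(total, m, coeffs, x_max):
    """Graduated frequencies total * sum_p coeffs[p] * Delta**p psi(x)
    for x = 0, ..., x_max, where psi is the Poisson exponential with mean m."""
    psi = [math.exp(-m)]
    for x in range(1, x_max + 1):
        psi.append(psi[-1] * m / x)
    fitted = [0.0] * (x_max + 1)
    diff = list(psi)  # Delta**0 psi
    for p, c in enumerate(coeffs):
        if p > 0:
            # next backward difference, with the value before x = 0 taken as 0
            diff = [diff[0]] + [diff[i] - diff[i - 1] for i in range(1, len(diff))]
        for x in range(x_max + 1):
            fitted[x] += c * diff[x]
    return [total * v for v in fitted]


# Coefficients c_0, ..., c_4 as listed with Table 1.
table1 = type_b_frequencies(1000, 4.175, [1.0, 0.0, 1.74368, -0.078298, 0.094592], 14)
```

The first few entries of `table1` can be set beside the Type B column of the table as a check on the arithmetic.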


TABLE 2

  x    Actual      Frequency computed    Frequency by
       frequency   by Type III           Type B
  0       44              59                  48.1
  1      135             111                 121.6
  2       45              45                  58.5
  3       12              20                  10.4
  4        8               9                   3.5
  5        3               4                   4.3
  6        1               2                   2.9
  7        3               1                   1.2

$m = 1.33466$, $\mu_2 = 1.44179$, $\mu_3 = 3.60662$

$\alpha_3 = \mu_3/\mu_2^{3/2} = 2.0833$, $c_2 = .05356$, $c_3 = -.32510$

$F(x) = \psi(x) + .05356\,\Delta^2\psi(x) - .32510\,\Delta^3\psi(x)$
TABLE 3

Number of alpha particles from a bar of polonium in intervals of 1/8 of one minute

  x    Frequency   Type IV   Type B     Type B     Type B
                             2 terms    3 terms    4 terms
  0       57          50       49.5       49.0       58.2
  1      203         183      201.3        ...      199.8
  2      383         392        ...        ...      386.1
  3      525         544      532.3      533.8      523.9
  4      532         539        ...      521.5      532.1
  5      408         417        ...        ...      418.2
  6      273         250      254.8      254.4      260.2
  7      139         131      137.1      136.7      134.0
  8       45          61        ...       63.9       66.7
  9       27          26       26.1       26.2       22.9
 10       10          12        9.4        9.6        8.6
 11        4           4        ...        3.1        3.6
 12        0           1         .9         .9        1.6
 13        1           0         .2         .2         .8
 14        1           0        ...         .0         .3

$m = 3.87155$, $\mu_2 = 3.69477$, $\mu_3 = 3.39791$, $\mu_4 = 47.86888$

$\alpha_3 = .47844$, $\alpha_4 = 3.506536$

$F(x) = \psi(x) - .08839\,\Delta^2\psi(x) - .00930\,\Delta^3\psi(x) + .16810\,\Delta^4\psi(x).$

Type B, 4 terms: $P(\chi^2 > 4.50) = .72$, $n = 7$
Type IV: $P(\chi^2 > 10.8) = .16$, $n = 7$
TABLE 4

Mortality Among Female Nominees

 x   Deaths   Elderton   Type A   Type B    Type B    Type B    Type B
              Type V              2 terms   3 terms   6 terms   5 terms

 0      4        4          2       1.4      -6.9       -.4       4.1
 1     18       10         15      26.3       7.1       9.4      13.1
 2     53       80         78     109.7     100.1      84.6      77.4
 3    265      261        235     248.3     268.4     252.3     242.5
 4    438      441        426     379.5     418.8     425.9     427.4
 5    525      480        521     432.7     461.0     484.0     494.1
 6    342      381        411     388.8     388.4     402.6     408.1
 7    253      247        225     285.4     263.5     259.0     253.9
 8    128      137        107     170.8     145.5     132.2     124.9
 9     82       68         66      84.3      68.3      58.6      54.1
10     28       32         44      32.9      28.2      26.2      26.4
11     12       14         22       8.6      11.0      13.9      16.4
12      8        6          8      -.01       4.7       8.2      10.7
13      5        3          2      -2.1       2.1       4.3       5.9
14      1        1          0      -1.5       1.3       2.0       2.5


Adjusted moments:
$m = 5.30435$          $\alpha_3 = .703564$
$\mu_2 = 3.573345$     $\alpha_4 = 3.996196$
$\mu_3 = +4.752437$
$\mu_4 = 51.02659$
$\mu_5 = 193.439125$


Rough moments:
$m = 5.30435$
$\nu_2 = 3.65668$
$\nu_3 = 4.752437$
$\nu_4 = 52.85276$
$\nu_5 = 197.39949$


Type A: $f(t) = \varphi(t) + .117261\,\varphi^{(3)}(t) + .041508\,\varphi^{(4)}(t)$

Type B: $F(x) = \psi(x) - .86550\,\Delta^2\psi(x) - .77352\,\Delta^3\psi(x) + .02814\,\Delta^4\psi(x) + .57459\,\Delta^5\psi(x)$

Using uncorrected moments

Type B: $F(x) = \psi(x) - .82384\,\Delta^2\psi(x) - .73185\,\Delta^3\psi(x) + .03192\,\Delta^4\psi(x) + .94033\,\Delta^5\psi(x)$

(last column above)
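The Type A coefficients listed for Table 4 can be recovered from the adjusted moments. The sketch below assumes the usual relations α3 = μ3/μ2^(3/2), α4 = μ4/μ2², with coefficients α3/6 and (α4 - 3)/24; these relations and the function names are assumptions made for illustration, and the sign convention attached to the third-derivative term is left as printed in the text.

```python
def standardized_moments(mu2, mu3, mu4):
    """alpha_3 = mu3 / mu2**1.5 and alpha_4 = mu4 / mu2**2."""
    return mu3 / mu2 ** 1.5, mu4 / mu2 ** 2


def type_a_coefficients(a3, a4):
    """Magnitudes of the third- and fourth-derivative coefficients
    (assumed forms: alpha_3 / 3! and (alpha_4 - 3) / 4!)."""
    return a3 / 6.0, (a4 - 3.0) / 24.0


# Adjusted moments printed under Table 4.
a3, a4 = standardized_moments(3.573345, 4.752437, 51.02659)
k3, k4 = type_a_coefficients(a3, a4)
print(f"alpha3 = {a3:.6f}, alpha4 = {a4:.6f}")
print(f"coefficients: {k3:.6f}, {k4:.6f}")
```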
TABLE 5

 x    Frequency   Type VI   Type B    Type B
                            4 terms   5 terms

 0        1          1       -9.5      -2.0
 1       66         50       83.2      69.9
 2      167        168      141.6     143.1
 3       98        100      102.3     110.7
 4       34         36       41.5      40.2
 5        9         10        8.7       4.6
 6        2          2        .05       2.0
 7        1         .5       -.4        1.0

Corrected moments:
$m = 2.402174$
$\mu_2 = .928835$
$\mu_3 = .893096$
$\mu_4 = 4.088800$


Rough moments:
$m = 2.402174$
$\nu_2 = 1.012169$
$\nu_3 = .893096$
$\nu_4 = 4.313176$
$\nu_5 = 11.28304$
$\alpha_3 = .87704$
$\alpha_4 = 4.2101$


Type B, adjusted moments:

$F(x) = \psi(x) - .73667\,\Delta^2\psi(x) - .48516\,\Delta^3\psi(x) - .06424\,\Delta^4\psi(x) + .10365\,\Delta^5\psi(x)$

*Type B, rough moments:

$F(x) = \psi(x) - .69805\,\Delta^2\psi(x) - .44654\,\Delta^3\psi(x) - .06687\,\Delta^4\psi(x) + .15165\,\Delta^5\psi(x)$

* This is used in the last column of the table above. There is a slight error here, which however will not affect the results materially. The third decimal place may be slightly wrong.
TABLE 6

Normal curve

 x    Frequency   Normal curve   Type B

 0        .6           .6          2.3
 1       2.8          2.7          4.7
 2      11.5         10.9          8.7
 3      27.7         30.1         25.2
 4      59.1         58.4         55.2
 5      84.7         80.1         79.5
 6      74.1         76.9         80.1
 7      50.5         52.2         58.1
 8      23.2         25.0         29.7
 9      12.2          8.4          8.6
10       1.3          2.4          -.9

Moments corrected:

$m = 5.393443$
$\mu_2 = 2.769635$
$\mu_3 = .029805$, $\mu_4 = 22.40663$
$\alpha_3 = .0064$
$\alpha_4 = 2.920997$


Type B: $F(x) = \psi(x) - 1.3119\,\Delta^2\psi(x) - .4179\,\Delta^3\psi(x) + 2.1625\,\Delta^4\psi(x)$

Colorado State College
A TEST OF A SAMPLE VARIANCE BASED ON BOTH TAIL ENDS OF THE DISTRIBUTION

By John W. Fertig

With the assistance of Elizabeth A. Proehl¹

(1) Introduction

In testing the hypothesis, say $H_0$, that an observed sample $E$ of size $N$ has been drawn from a normal population for which the standard deviation, $\sigma$, has a particular value, $\sigma_0$, one may form the ratio

$$v = \sum_{i=1}^{N} (x_i - m)^2 / \sigma_0^2 \qquad \text{(I)}$$

if the population mean $m$ be known, or

$$v' = \sum_{i=1}^{N} (x_i - \bar{x})^2 / \sigma_0^2 \qquad \text{(II)}$$

where $\bar{x}$ is the sample mean, if the population mean be unknown. The probability of obtaining a larger (or smaller) value of $v$ or $v'$ than that observed may readily be obtained from the appropriate tail area of the $\chi^2$ distribution with $n = N$ or $n = (N - 1)$ degrees of freedom respectively. The alternative hypotheses to $H_0$ concerning the normal populations from which the sample may have been drawn assign different values to $\sigma$ and form a set of hypotheses, $\Omega$. The members of $\Omega$ may be classed according to whether they specify $\sigma > \sigma_0$ or $\sigma < \sigma_0$. The practice of regarding only one tail of the distribution, the upper or lower depending on whether $v > N$ or $v < N$, is tantamount to accepting as admissible alternatives to $H_0$ only one of the classes of $\Omega$.

The alternatives may sometimes be limited to one class or the other through some a priori knowledge, or the problem may be such that only one of the classes is relevant. However, since this is not generally the case, some method of considering all of the alternatives is needed. When testing hypotheses concerning the mean of the sampled population, the problem is quite simple, since the distribution of means is symmetrical. Thus, the “corresponding” value to any positive deviation, $(\bar{x} - m)$, is the negative deviation of the same magnitude. Merely doubling the tail area pertaining to either of the deviations will serve to take account of both classes of alternatives, i.e., those in which $m > m_0$ and those in which $m < m_0$. The problem is more difficult in the case of $v$ or $v'$, since the distribution is not symmetrical. In addition to the value of $v$ or $v'$ pertaining to the observed sample we require a “corresponding” value at the other end of the distribution. The definition of “corresponding” which is accepted will determine the required value. There may be a number of such definitions but not all of these will be equally acceptable. The value of $v$ which delimits an equal tail area specifies one of the possible definitions of “corresponding.” Another definition would require that the ordinates at the two values of $v$ be equal.

¹ From the Memorial Foundation for Neuro-Endocrine Research and the Research Service of the Worcester State Hospital, Worcester, Massachusetts.

The Neyman and Pearson Approach. Generalized procedures for testing statistical hypotheses have been elaborated in recent years by J. Neyman and E. S. Pearson (1-5). These have considerable philosophical appeal and will be traced as a basis of solution of the immediate problem. A test of a hypothesis $H_0$ consists essentially of a rule for rejecting $H_0$ when the observed sample $E$ falls within a suitable critical region $w$ of the $N$-dimensioned sample space $W$, and of accepting $H_0$ when $E$ falls in $(W - w)$. In testing any hypothesis two types of error may be made:

i) $H_0$ may be rejected when it is true;

ii) $H_0$ may be accepted when some alternative hypothesis, $H_i$, is true.

Errors of the first kind may be considered “equivalent” since, if a true hypothesis is to be rejected, it is immaterial which one is chosen. Furthermore, the first type of error can be controlled through our choice of the size of $w$, say $\alpha$. The size of $w$ represents the probability of a sample $E$ being an element of $w$ when the hypothesis $H_0$ is true. This probability may be designated briefly as $P\{E \in w \mid H_0\}$. Then

$$P\{E \in w \mid H_0\} = \int \cdots \int_w p(E \mid H_0)\, dx_1\, dx_2 \cdots dx_N = \alpha \qquad \text{(III)}$$

where $p(E \mid H_0)$ is the elementary probability law of the sample when $H_0$ is true, i.e.,

$$p(E \mid H_0) = p(x_1, x_2, \ldots, x_N \mid H_0) \qquad \text{(IV)}$$

Errors of the second type, however, are not equivalent, since their consequences depend on the difference of the true hypothesis from $H_0$. The utility of a test of $H_0$ will depend largely on how it controls the second type of error. Ideally, the selection of a critical region should take into consideration the probabilities a priori of the hypotheses composing $\Omega$. Since these probabilities are generally unknown, tests may be sought which are valid independently of them.

A distinction must be made between simple hypotheses which specify completely the elementary probability law of the sample, $p(E)$, and composite hypotheses which specify the law subject to one or more undetermined parameters.

(2) Simple Hypothesis Concerning Population Variance

A test based on a critical region $w_0$ may be called independent of the probabilities a priori of the alternative hypotheses if it is more powerful than any other equivalent test for all of the alternative hypotheses (3). An equivalent test is one based on a region $w_1$ of the same size, $\alpha$, i.e.,

$$P\{E \in w_0 \mid H_0\} = P\{E \in w_1 \mid H_0\} \qquad \text{(V)}$$

The power of a test based on any critical region, $w_1$, is the probability of its rejecting a hypothesis $H_0$ when some other hypothesis $H_i$ is true. That is, it is the probability of $E$ falling in $w_1$ when $H_i$ is true. Denote this power by $P\{E \in w_1 \mid H_i\}$. The greater the power of a test, the smaller the risk of the second type of error. If tests as defined above exist, they minimize the probability of the second type of error. Furthermore, the probability of the first type of error is no larger than $\alpha$. Neyman and Pearson (2) have designated regions satisfying this definition as Best Critical Regions for testing $H_0$ with regard to the set $\Omega$. If there is no such Best Critical Region, some compromise region must be chosen.

A necessary and sufficient condition for $w_0$ to be a Best Critical Region with regard to an alternative $H_i$ is that within $w_0$

$$p(E \mid H_0) \le k\, p(E \mid H_i) \qquad \text{(VI)}$$

where $k$ is some constant depending on $\alpha$. If this inequality is true for any $H_i$, $w_0$ will be a Best Critical Region for the set $\Omega$.

Neyman and Pearson (2) have shown that in testing the hypothesis that $\sigma = \sigma_0$, when the population mean $m$ is known, there are two Best Critical Regions, one pertaining to the class of alternatives for which $\sigma < \sigma_0$ and defined by $v < v_1$, the other to the class $\sigma > \sigma_0$ defined by $v > v_2$. $v_1$ and $v_2$ are values of $v$ so chosen that the size of the critical region shall be $\alpha$. Although there is no Best Critical Region for all of the alternatives, the choice of a compromise critical region should still depend on its control of the second source of error, that is, on its power for the various alternatives (4). Such a compromise region may be designated as a Good Critical Region. What is needed is a region $w_0$ of size $\alpha$ defined by the inequalities $v < v_1$ and $v > v_2$. If $v_1$ and $v_2$ are taken as the values cutting off equal tail areas, then the power of the test will be less than $\alpha$ for some values of $\sigma$ less than $\sigma_0$. For those values of $\sigma$, $H_0$ would be accepted more frequently than if it were true. Thus a first requirement for a Good Critical Region is that its power should nowhere be less than $\alpha$, the value when $H_0$ is true. Of all such unbiassed Critical Regions of size $\alpha$, $w_0$ should then be selected so that its power is everywhere greater than that of any other equivalent unbiassed region.
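The bias of the equal-tails region can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own computation: it takes a hypothetical case with n = 9 degrees of freedom, uses the standard tabled 2.5 per cent points 2.700 and 19.023 as the equal-tails limits, and borrows the limits 2.953 and 20.305 that the paper quotes later for an unbiassed region with 9 degrees of freedom. The function names are invented for the example, and the χ² tail area is computed from the finite closed form that holds for odd degrees of freedom.

```python
import math


def chi2_sf_odd(x, n):
    """P(chi-square with n d.f. > x) for odd n, using
    2*Q(z) + 2*Z(z) * sum_{r=1}^{(n-1)/2} z**(2r-1) / (1*3*...*(2r-1)),
    where z = sqrt(x), Q is the normal upper tail and Z the normal density."""
    if n % 2 != 1:
        raise ValueError("this closed form needs an odd number of d.f.")
    if x <= 0:
        return 1.0
    z = math.sqrt(x)
    two_q = math.erfc(z / math.sqrt(2.0))            # equals 2*Q(z)
    dens = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = z, 0.0
    for r in range(1, (n - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)                      # next term of the sum
    return two_q + 2.0 * dens * total


def power(v1, v2, ratio, n):
    """Chance of falling in {v < v1} or {v > v2} when the true variance
    is ratio * sigma_0**2, so that v / ratio is chi-square with n d.f."""
    lower = 1.0 - chi2_sf_odd(v1 / ratio, n)
    upper = chi2_sf_odd(v2 / ratio, n)
    return lower + upper


n = 9
for label, (v1, v2) in {"equal tails": (2.700, 19.023),
                        "unbiassed": (2.953, 20.305)}.items():
    row = ", ".join(f"{power(v1, v2, rho, n):.4f}" for rho in (0.90, 0.95, 1.00, 1.05))
    print(f"{label:12s} power at variance ratio .90, .95, 1.00, 1.05: {row}")
```

For the equal-tails limits the power just below a variance ratio of 1 falls under the size of the region, which is the bias the text describes; for the second pair of limits the power rises on both sides.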

Critical Regions sufficiently satisfying the above requirements can often be obtained by stipulating that the first derivative of the power function with respect to $\theta$, the parameter under consideration, shall be zero at $\theta = \theta_0$, and that the second shall be a maximum there. Then not only does the probability of the second source of error decrease as we move away from $\theta_0$, but it decreases most rapidly in the vicinity of $\theta_0$. Critical Regions satisfying these conditions are called unbiassed Critical Regions of Type A (4). Under certain assumptions concerning the nature of the elementary probability law $p(E \mid \theta)$ it can be shown that $w_0$ is defined by the inequalities $\varphi_1 < c_1$ and $\varphi_1 > c_2$ where $c_1$ and $c_2$ satisfy the conditions

$$\int_{c_1}^{c_2} p(\varphi_1)\, d\varphi_1 = 1 - \alpha \qquad \text{(VII)}$$

$$\int_{c_1}^{c_2} \varphi_1\, p(\varphi_1)\, d\varphi_1 = 0 \qquad \text{(VIII)}$$

where

$$\varphi_1 = \left.\frac{\partial \log p(E \mid \theta)}{\partial \theta}\right|_{\theta = \theta_0} \qquad \text{(IX)}$$

and $p(\varphi_1)$ is the distribution function of $\varphi_1$.

In applying these results to the testing of the hypothesis that $\sigma = \sigma_0$,² when the population mean is known,

$$\varphi_1 = (v - N)/\sigma_0 \qquad \text{(X)}$$

Obviously $p(v)$, the distribution of $v$, may be considered instead of $p(\varphi_1)$. $w_0$ is defined by the inequalities $v < v_1$ and $v > v_2$ where

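$$\int_0^{v_1} p(v)\, dv + \int_{v_2}^{\infty} p(v)\, dv = \alpha_1 + \alpha_2 = \alpha \qquad \text{(XI)}$$

$$\int_{v_1}^{v_2} (v - N)\, p(v)\, dv = 0, \quad \text{i.e.} \quad \left[ v^{N/2} e^{-v/2} \right]_{v_1}^{v_2} = 0 \qquad \text{(XII)}$$

$w_0$ so defined is also of Type $A_1$, that is, its power curve lies everywhere above that of any other equivalent region whose power function has a vanishing first derivative at $\sigma = \sigma_0$ (4).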

The use of $w_0$ as the appropriate critical region is equivalent to the use of $r$ as a test criterion, where

$$v^{N/2} e^{-v/2} = r \qquad \text{(XIII)}$$

That is, a value of $v$ yielding the same $r$ as the observed $v$ may be taken as the corresponding value. Reference to the appropriate tables and summing of the two tail areas gives $P_r$, the probability of obtaining a smaller value of $r$ when $H_0$ is true. $H_0$ may be rejected if $P_r$ is less than some previously fixed number, say $\alpha$. If the distribution of $r$ could be evaluated the necessity of dealing with two values of $v$ would be obviated.

The criterion $r$ is equivalent to that deduced by the use of maximum likelihood ratios (6). Thus,

$$p(E \mid \sigma^2) = (2\pi\sigma^2)^{-N/2}\, e^{-\sum (x_i - m)^2 / 2\sigma^2} \qquad \text{(XIV)}$$

² The solution is the same in terms of $\sigma^2$.
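Maximizing $p(E \mid \sigma^2)$ for fixed $E$ and all possible $\sigma^2$ we have

$$p_{\max}(E \mid \sigma^2) = \left( \frac{2\pi \sum (x_i - m)^2}{N} \right)^{-N/2} e^{-N/2} \qquad \text{(XV)}$$

$$\lambda = \frac{p(E \mid \sigma_0^2)}{p_{\max}(E \mid \sigma^2)} \qquad \text{(XVI)}$$

$$= \left( \frac{v}{N} \right)^{N/2} e^{-(v - N)/2} \qquad \text{(XVII)}$$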

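The $h$th moment coefficient of $\lambda$ about zero, $\mu'_h(\lambda)$, is given by

$$\mu'_h(\lambda) = \left( \frac{2e}{N} \right)^{hN/2} \frac{\Gamma\left[ \frac{N}{2}(1 + h) \right]}{\Gamma(N/2)\,(1 + h)^{N(1+h)/2}} \qquad \text{(XVIII)}$$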




TABLE I

Probability that a sample has been drawn from a normal population with a specified variance or standard deviation

Degrees of Freedom, n
For $N$ infinite, $(-2 \log_e \lambda)$ will be distributed as $\chi^2$ with one degree of freedom. For finite values of $N$, however, we have not been able to evaluate the distribution of $\lambda$, although the distribution of the Incomplete Beta Function serves as a good approximation. Approximate distributions for several values of $N$ have been obtained. $P_\lambda$, the probability of obtaining a smaller value of $\lambda$ than that observed, as obtained from these distributions agrees well with the sum of the tail areas pertaining to $v_1$ and $v_2$ yielding the same value of $\lambda$ (or $r$). The construction of tables is simplified by taking (1)

$$\log_{10} \lambda = \frac{N}{2}\left( \log_{10} e - k \right) \qquad \text{(XIX)}$$

That is,

$$x - \log_e x = k \log_e 10 \qquad \text{(XX)}$$

where $x = v/N$. Equation (XX) is independent of $N$ and may be solved once and for all for $x$, given $k$.³ In Figure 1 is plotted the graph of equation (XX). For convenience, the branch of the curve giving the roots greater than unity has been folded back with altered scale from the minimum value of $k$, $\log_{10} e$, occurring at $x = 1$. Table I was then constructed by multiplying the two values of $x$ for a given $k$ by $(N/2)^{1/2}$, referring to the Tables of the Incomplete Gamma Function (7) with $p = (N - 2)/2$, and adding the resulting two tail areas. The values for the odd numbers above 12 were obtained by interpolating between the even numbers. For $N = 1$, $(x)^{1/2}$ was used as a normal deviate. The values in Table I should be correct to four decimals. Table I is entered with the number of degrees of freedom, $n$, on which $x$ is based. In the case of the simple hypothesis this is $N$.

The following may serve as an illustration: Blood urea nitrogen determinations (mg./100 cc.) were made on a sample of 25 schizophrenic patients. The mean was found to be 15.56, the variance, 10.486. Previous investigation of blood urea nitrogen on a large sample of normal control subjects gave a mean of 16.03 and a variance of 20.268, which for the purpose of the example may be considered as the population parameters. Then we may wish to test the hypothesis that the variance of the sampled population, $\sigma^2$, is $\sigma_0^2 = 20.268$, knowing the mean of the sampled population to be 16.03. Calculate

$$x = \frac{v}{N} = \frac{s^2 + (\bar{x} - m)^2}{\sigma_0^2} = .528$$

Referring to Fig. 1, the value of $k$ is about .505. Turning to Table I with $k = .505$,⁴ $n = 25$, $P$ is found to be .0457. We should thus be inclined to reject the hypothesis.
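The worked example can be retraced numerically. The sketch below is a hedged illustration rather than the method used to build Table I: it computes x and k from equation (XX), finds the companion root of (XX) on the other side of x = 1 by bisection, and sums the two χ² tail areas with a power series for the incomplete gamma function. All function names are invented for the example.

```python
import math


def lower_reg_gamma(a, x, max_terms=1000):
    """Regularized lower incomplete gamma P(a, x), by the series
    e^{-x} x^a / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n))."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_cdf(v, n):
    """P(chi-square with n degrees of freedom < v)."""
    return lower_reg_gamma(n / 2.0, v / 2.0)


def k_of_x(x):
    """k from equation (XX): x - log_e x = k log_e 10."""
    return (x - math.log(x)) / math.log(10.0)


def companion_root(x, iters=200):
    """Other root t of t - log_e t = x - log_e x, on the far side of 1."""
    target = x - math.log(x)
    g = lambda t: t - math.log(t) - target
    a = 1.0                                    # g(1) <= 0 always
    b = 2.0 * target if x < 1.0 else math.exp(-target)   # g(b) > 0
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if g(mid) > 0.0:
            b = mid
        else:
            a = mid
    return 0.5 * (a + b)


# Figures quoted in the example.
N, s2, xbar = 25, 10.486, 15.56
m0, var0 = 16.03, 20.268

x = (s2 + (xbar - m0) ** 2) / var0
k = k_of_x(x)
x_other = companion_root(x)
lo, hi = sorted((x, x_other))
P = chi2_cdf(N * lo, N) + (1.0 - chi2_cdf(N * hi, N))
print(f"x = {x:.3f}, k = {k:.3f}, companion x = {x_other:.3f}, P = {P:.4f}")
```

Solving (XX) directly in this way gives a value of k close to the .507 quoted in the footnote, rather than the .505 read from the figure.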

For $N$ small, the area of the tail of the distribution near zero is considerably larger than that at the upper end. As $N$ increases the distribution of $v$ becomes more and more symmetrical and the two areas approach equality. Even for $N = 50$, however, they are rather unequal, so that merely doubling the area pertaining to the observed $v$ does not give a sufficiently accurate approximation. For $N > 50$ an approximation correct within several units in the third decimal place may be obtained by taking $\sqrt{2N}\,(\sqrt{x} - 1)$ as a normal deviate. This assumes that the standard deviation is normally distributed with variance $\sigma^2/2N$.

³ If the solution were explicit the distribution of $\lambda$ could easily be deduced from that of $x$.

⁴ $k$ obtained directly from (XX) is .507, corresponding to $P = .0427$.
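The large-sample approximation can be sketched in a few lines. The figures N = 100 and x = 0.8 below are hypothetical, chosen only to show the arithmetic, and reading the result as a two-tailed normal probability is an assumption about how the deviate is meant to be used.

```python
import math


def normal_deviate(x, n_obs):
    """sqrt(2N) * (sqrt(x) - 1), where x = v / N."""
    return math.sqrt(2.0 * n_obs) * (math.sqrt(x) - 1.0)


def two_tail_probability(z):
    """Area of the standard normal curve beyond +|z| and -|z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


z = normal_deviate(0.8, 100)      # hypothetical x and N
p = two_tail_probability(z)
print(f"deviate = {z:.4f}, two-tailed P = {p:.4f}")
```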

(3) Composite Hypothesis Concerning Population Variance

Here $H_0$ specifies only the value of the parameter $\theta = \theta_0$, leaving undetermined the value of a second parameter, $\nu$. Thus, $H_0$ consists of a subset, $\omega$, of simple hypotheses, each of which specifies a different value for $\nu$. Any simple hypothesis specifying different values of both parameters, $\theta$ and $\nu$, is an alternative to $H_0$. These alternatives form the set $\Omega$. The elementary probability law determined by $H_0$ is $p(E \mid H_0) = p(E \mid \theta_0 \nu)$, while that determined by an alternative hypothesis $H_i$ is $p(E \mid H_i) = p(E \mid \theta_i \nu_i)$. In testing composite hypotheses the first requirement is to find regions “similar” to $W$ with regard to $\nu$, i.e., such that the chance of rejection of a true hypothesis, $P\{E \in w \mid H_0\}$, equals $\alpha$ for all the values of $\nu$ specified by the simple hypotheses composing $H_0$. A test based on a similar region $w_0$ may be called independent of the probabilities a priori, if its power with respect to all the alternatives of $\Omega$ is greater than that of any other similar region $w_1$ of the same size, $\alpha$ (3). Let

$$\varphi_2 = \left.\frac{\partial \log p(E \mid \theta \nu)}{\partial \nu}\right|_{\theta = \theta_0} \qquad \text{(XXI)}$$

Then the equations $\varphi_2 = $ constant will describe hypersurfaces in $N$-dimensioned space, on one of which the observed $E$ must fall. Under certain assumptions pertaining to the law of elementary probability it can be shown (2) that a necessary and sufficient condition for $w$ to be a similar region is that

$$P\{E \in w(\varphi_2) \mid H_0\} = \alpha\, P\{E \in W(\varphi_2) \mid H_0\} \qquad \text{(XXII)}$$

for all values of $\varphi_2$, where $w(\varphi_2)$ and $W(\varphi_2)$ are parts of the surface $\varphi_2 = $ constant common to $w$ and $W$ respectively. A similar region is then built up of these parts $w(\varphi_2)$ obtaining for the various values of $\varphi_2$. The Best Critical Region, $w_0$, for a particular simple alternative, $H_i$, must then be composed of pieces, $w_0(\varphi_2)$, maximizing $P\{E \in w(\varphi_2) \mid H_i\}$. The problem is the same as for simple hypotheses except that we shall be working in a space $W(\varphi_2)$ of $(N - 1)$ dimensions. $w_0(\varphi_2)$ is defined by the inequality

$$p(E \mid H_i) > k(\varphi_2)\, p(E \mid H_0) \qquad \text{(XXIII)}$$

where $k(\varphi_2)$ is some constant depending on $\alpha$. If $w_0(\varphi_2)$ is the same for all $H_i$, then $w_0$ is the Best Critical Region for testing $H_0$ with respect to $\Omega$.

Neyman and Pearson showed (2) that in testing the composite hypothesis that $\sigma = \sigma_0$ when the population mean is unknown there are two Best Critical Regions corresponding to the class of alternatives $\sigma < \sigma_0$ and $\sigma > \sigma_0$, defined respectively by the inequalities $v' < v_1'$ and $v' > v_2'$. If the whole set of alternatives, $\Omega$, is to be considered some compromise region must be sought. Dealing with the case where similar regions exist Neyman (5) defines a Critical Region as unbiassed and of Type B if the first derivative of the power function, $P\{E \in w \mid H_i\}$, with respect to $\theta$ vanishes at $\theta = \theta_0$, and if the second derivative at that point is a maximum. Let

$$\varphi_1 = \left.\frac{\partial \log p(E \mid \theta \nu)}{\partial \theta}\right|_{\theta = \theta_0} \qquad \text{(XXIV)}$$

Then it can be shown that the desired region will be defined by the inequalities $\varphi_1 < k_1(\varphi_2)$ and $\varphi_1 > k_2(\varphi_2)$ where $k_1(\varphi_2)$ and $k_2(\varphi_2)$ are determined to satisfy

$$\int_{k_1(\varphi_2)}^{k_2(\varphi_2)} p(\varphi_1 \varphi_2)\, d\varphi_1 = (1 - \alpha)\, p(\varphi_2) \qquad \text{(XXV)}$$

and

$$\int_{k_1(\varphi_2)}^{k_2(\varphi_2)} \varphi_1\, p(\varphi_1 \varphi_2)\, d\varphi_1 = (1 - \alpha) \int_{-\infty}^{\infty} \varphi_1\, p(\varphi_1 \varphi_2)\, d\varphi_1 \qquad \text{(XXVI)}$$

where $p(\varphi_2)$ is the distribution function of $\varphi_2$, and $p(\varphi_1 \varphi_2)$ is the simultaneous distribution of $\varphi_1$ and $\varphi_2$.

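Applying equations (XXV) and (XXVI) it follows that the appropriate Critical Region is defined by the inequalities $v' < v_1'$ and $v' > v_2'$ where

$$\int_0^{v_1'} p(v')\, dv' + \int_{v_2'}^{\infty} p(v')\, dv' = \alpha_1 + \alpha_2 = \alpha \qquad \text{(XXVII)}$$

and

$$\left[ v'^{(N-1)/2} e^{-v'/2} \right]_{v_1'}^{v_2'} = 0 \qquad \text{(XXVIII)}$$

where $p(v')$ is the distribution function of $v'$.

The use of the unbiassed Critical Region of Type B corresponds to adopting as a criterion

$$r' = v'^{(N-1)/2} e^{-v'/2} \qquad \text{(XXIX)}$$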
Since $v'$ derived from a sample of size $N$ is distributed as $v$ derived from a sample of size $(N - 1)$, it follows that $r'$ is equivalent to the $r$ of equation (XIII) based on a sample of size $(N - 1)$. Therefore Table I may also be used for testing the hypothesis that $\sigma = \sigma_0$ whatever be the population mean, by entering with the number of degrees of freedom, $N - 1$.

In the example previously used, compute

$$x = \frac{s^2}{\sigma_0^2} = 0.517$$

From Figure 1, $k$ is approximately .51, corresponding to $P = .0422$.
$r'$ is not the same as the maximum likelihood ratio $\lambda'$ (6).

$$\lambda' = \frac{p_{\max}(E \mid \sigma_0^2\, m)}{p_{\max}(E \mid \sigma^2\, m)} = \left( \frac{v'}{N} \right)^{N/2} e^{-v'/2 + N/2} \qquad \text{(XXX)}$$

As $N$ becomes infinite the distribution of $\lambda'$ is the same as that of the $\lambda$ of (XVI). For $N = 49$, the probabilities corresponding to $\lambda'$ agree with those using $r'$ to within a unit in the third decimal.

The $\lambda'$ test is biassed as may be seen in Figure 2 where we have plotted the power of the test based on the region $w$ defined by $v_1' = 3.187$, $v_2' = 22.912$ for which $\alpha = .0436 + .0064 = .0500$, on the assumption that $\sigma_0^2 = 1.0$, for $N = 10$. Although the criterion is biassed it is slightly more sensitive to alternatives specifying $\sigma^2 < \sigma_0^2$ than is the unbiassed Critical Region of Type B defined by $v_1' = 2.953$, $v_2' = 20.305$, $\alpha = .0339 + .0161 = .0500$. The criterion of constant distribution, $p(v')$,

$$v'^{(N-3)/2} e^{-v'/2} = c' \qquad \text{(XXXI)}$$

has also been considered. In this case $v_1' = 1.903$, $v_2' = 17.391$, $\alpha = .0071 + .0429 = .0500$. This criterion is biassed for some alternatives specifying $\sigma^2 < \sigma_0^2$, but its power curve lies above that of the unbiassed region for $\sigma^2 > \sigma_0^2$.

Fig. 2. Comparison of Critical Regions for $v'$. $H_0$ Specifies $\sigma_0^2 = 1.0$. $N = 10$.

Apparently the bias may be shifted at will by changing the exponent of $v'$. This may be desirable if greater weight is to be given to one class of alternatives. In fact decreasing the exponent of $v'$ to 0 produces the Best Critical Region for the class of alternatives specifying $\sigma^2 > \sigma_0^2$, and defined by $v_1' = 0$, $v_2' = 16.919$ for $\alpha = .0500$. No region can be found giving greater power. On the other hand this region is insensitive to alternatives of the other class. Increasing the exponent indefinitely produces the Best Critical Region for the other class defined by $v_2' = \infty$ and $v_1' = 3.325$ for $\alpha = .0500$.
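The three pairs of limits quoted for N = 10 can be checked against the idea that each region equalizes a criterion of the form v'^e · exp(-v'/2) at its two ends, with only the exponent e changing. The sketch below is an illustrative check; the exponents (N - 1)/2, N/2 and (N - 3)/2 are assumed from the forms of the three criteria, and the names are invented for the example.

```python
import math


def log_criterion(v, exponent):
    """Natural logarithm of v**exponent * exp(-v / 2)."""
    return exponent * math.log(v) - v / 2.0


N = 10
regions = {
    "unbiassed Type B, exponent (N-1)/2": ((N - 1) / 2.0, 2.953, 20.305),
    "likelihood ratio, exponent N/2": (N / 2.0, 3.187, 22.912),
    "constant ordinate, exponent (N-3)/2": ((N - 3) / 2.0, 1.903, 17.391),
}

gaps = {}
for name, (exponent, v1, v2) in regions.items():
    # A gap near zero means both limits give the same criterion value.
    gaps[name] = log_criterion(v1, exponent) - log_criterion(v2, exponent)
    print(f"{name}: difference of log criterion = {gaps[name]:+.5f}")
```

Each difference comes out within rounding of zero, so a larger exponent moves both limits upward and a smaller one moves them downward, which is the shift of bias described in the text.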
ON THE POLYNOMIALS RELATED TO THE DIFFERENTIAL EQUATION

$$\frac{1}{y}\frac{dy}{dx} = \frac{a_0 + a_1 x}{b_0 + b_1 x + b_2 x^2} \equiv \frac{N}{D}$$

By Frank S. Beale

Introduction. In a previous issue of this Journal,¹ E. H. Hildebrandt has established the existence of a general system of polynomials $P_n(k, x)$ associated with the solutions of Pearson's Differential Equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{N}{D}, \qquad \text{(P)}$$

$N$ and $D$ being polynomials in $x$ of degrees not exceeding one and two respectively with no factor in common.

It was shown that the polynomials $P_n(k, x) \equiv P_n$ themselves satisfy certain differential equations and a recurrence relation. The classical polynomials of Hermite, Legendre, Laguerre, and Jacobi are special types of $P_n(k, x)$. Since the classical polynomials are employed rather extensively in statistical theory, certain of their properties are of special interest.

It is the purpose of this paper to determine from Hildebrandt's general equations some new properties of $P_n(k, x)$ and to apply these properties to the classical polynomials. The paper consists of two parts. In part I some theorems are established concerning common zeros of $D$ and $P_n$. In particular, a theorem is established to exhibit the conditions under which the zeros of $P_n$, which are not zeros of $D$, are simple. In part II a method is outlined for the classical polynomials by which one can determine the number and location of the real zeros in the various segments into which the zeros of $D$ divide the $x$ axis. The points of inflexion and the degree of the polynomials are also considered.

A new feature of the method employed is, we believe, its being based upon the use of differential equations of first order, for the most part, while other investigators² have employed differential equations of second order. As to the results obtained, the author believes them to be partly new. They have points in common with the results of Fujiwara, Lawton and Webster.


¹ Systems of Polynomials Connected with the Charlier Expansions, etc., Annals of Math. Stat., Vol. II, 1931, pp. 379-439.

² M. Fujiwara: On the zeros of Jacobi's Polynomials, Japanese Journal of Math., Vol. 2, 1926, pp. 1, 2.

W. Lawton: On the zeros of Certain Polynomials Related to Jacobi and Laguerre Polynomials, Bull. Am. Math. Soc., Vol. 38, 1932, pp. 442-449.

M. S. Webster: Thesis, Univ. of Penna. These results were kindly communicated to me by Dr. Webster.
I. Theorems Concerning Common Zeros of $P_n(k, x)$ and $D$

The following equations will be employed later:

(1) $P_{n+1}(k, x) = [N + (k - n)D']\,P_n(k, x) + D\,P_n'(k, x)$.

(2) $P'_{n+1}(k, x) = (n + 1)\left[N' + \dfrac{2k - n}{2} D''\right] P_n(k, x)$.

(3) $P_{n+1}(k, x) = [N + (k - n)D']\,P_n(k, x) + n\left[N' + \dfrac{2k - n + 1}{2} D''\right] D\,P_{n-1}(k, x)$.

These are not explicitly given in Hildebrandt's paper but the method of obtaining them is outlined there in detail.

We shall make use of the following lemma which we state without proof.

Lemma (1). Let $P_n(x)$ be a polynomial of degree $n$. If both $P_n$ and $P_n'$ contain a factor $(x - a)^m$, $m < n$, then $P_n$ contains the factor $(x - a)^{m+1}$.

We also need an expression for $P^{(q)}_{n+1}(k, x)$. By repeatedly differentiating (2) and eliminating the derivatives $P^{(i)}_n(k, x)$ we get,

(4) $P^{(q)}_{n+1}(k, x) = \displaystyle\prod_{i=0}^{q-1} (n + 1 - i)\left[N' + \frac{2k - n + i}{2} D''\right] P_{n+1-q}(k, x), \qquad q \le (n + 1)$.

Theorem $I_1$. If $D$ is a perfect square, $D'$ is not a factor of $P_{n+1}(k, x)$, $n = 0, 1, 2, \ldots$

Proof: Assume $D'$ to be a factor of $P_{n+1}$. From (1), $D'$ is either a factor of $P_n$ or of $N + (k - n)D'$. But $D'$ is not a factor of $N + (k - n)D'$ as this implies that $D'$ is a factor of $N$ contrary to hypothesis on (P) that $D$ and $N$ have no factor in common. Thus, $D'$ is a factor of $P_n$, and by a repetition of the reasoning a factor finally of $P_1$, which as it was just pointed out, is impossible.

Theorem $I_2$. Set $D \equiv (\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$, $D$ not a perfect square. If $\alpha_i x + \beta_i$, $i = 1$ or 2, is a factor of $P_n$, then $(\alpha_i x + \beta_i)^q$ is a factor of $P_{n+q-1}$, $q = 1, 2, 3, \ldots$

Proof: From (1), $\alpha_i x + \beta_i$, being a factor of $P_n$ and $D$, is also a factor of $P_{n+1}$. From (2), $\alpha_i x + \beta_i$ is a factor of $P'_{n+1}$. From Lemma (1) it follows that $(\alpha_i x + \beta_i)^2$ is a factor of $P_{n+1}$. Continued repetition of the reasoning establishes the theorem.

Corollary. If both $\alpha_1 x + \beta_1$ and $\alpha_2 x + \beta_2$ are factors of $P_n$, then $D^q$ is a factor of $P_{n+q-1}$.

Theorem $I_3$. Assume $D$ of the same form as in Theorem $I_2$. If $\alpha_i x + \beta_i$, $i = 1$ or 2, is a factor of $P_{n+1}$ and no higher power of $\alpha_i x + \beta_i$ is such a factor then $\alpha_i x + \beta_i$ is a factor of $N + (k - n)D'$.

Proof: From (1), $\alpha_i x + \beta_i$ being a factor of $P_{n+1}$ and of $D$ is also a factor of either $N + (k - n)D'$ or of $P_n$. But $\alpha_i x + \beta_i$ a factor of $P_n$ requires, from $I_2$, that $(\alpha_i x + \beta_i)^2$ be a factor of $P_{n+1}$ contrary to hypothesis. Thus, $\alpha_i x + \beta_i$ is a factor of $N + (k - n)D'$.
Corollary. If $(\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$, $(\alpha_1, \alpha_2 \ne 0)$, is a factor of $P_{n+1}$ and no higher power of either $\alpha_1 x + \beta_1$ or $\alpha_2 x + \beta_2$ is contained in $P_{n+1}$ then $N + (k - n)D' \equiv 0$. For from $I_3$, $N + (k - n)D'$ contains $(\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$ as a factor which implies $N + (k - n)D'$, being linear, vanishes identically.

Theorem $I_4$. If $(\alpha_i x + \beta_i)^q$ and no higher power of $\alpha_i x + \beta_i$ is a factor of $P_{n+q-1}$ then $\alpha_i x + \beta_i$ and no higher power of $\alpha_i x + \beta_i$ is a factor of $P_n$.

Proof: Let us write,

(A) $P_{n+q-1} = (\alpha_i x + \beta_i)^q\, \varphi_{n-1}$, $\varphi_{n-1}$ a polynomial of degree $\le n - 1$ which does not contain the factor $\alpha_i x + \beta_i$. Taking the $(q - 1)$th derivative of (A) by Leibnitz' Theorem, we get,

(B) $P^{(q-1)}_{n+q-1} = \displaystyle\sum_{j=0}^{q-1} \binom{q-1}{j} \frac{d^j}{dx^j}(\alpha_i x + \beta_i)^q \; \varphi^{(q-1-j)}_{n-1}$.

On setting $q = q - 1$ in (4) there results,

(C) $P^{(q-1)}_{n+q-1} = \displaystyle\prod_{i=0}^{q-2} (n + q - 1 - i)\left[N' + \frac{2k - n - q + i + 2}{2} D''\right] P_n$.

From (B) we see that $\alpha_i x + \beta_i$ is a factor of $P^{(q-1)}_{n+q-1}$. No higher power of $\alpha_i x + \beta_i$ is such a factor. From (C) our theorem now follows.

Corollary (1). Under the hypotheses of Theorem $I_4$, $\alpha_i x + \beta_i$ is a factor of $N + (k - n + 1)D'$. This follows at once from $I_4$ and $I_3$.

Corollary (2). If $D^q \equiv (\alpha_1 x + \beta_1)^q (\alpha_2 x + \beta_2)^q$, $(\alpha_1, \alpha_2 \ne 0)$, is a factor of $P_{n+q-1}$ and no higher powers of either $\alpha_1 x + \beta_1$ or $\alpha_2 x + \beta_2$ are factors, then $N + (k - n + 1)D' \equiv 0$. For the linear expression $N + (k - n + 1)D'$ contains, from Corollary (1), the quadratic factor $(\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$.
The following lemma can be easily established and is given without proof.

Lemma (2). Assume $D$ of the same form as in Theorem $I_2$. Then there is only one value of $s$ for which $N + sD'$ contains $\alpha_i x + \beta_i$ as a factor.

Theorem $I_5$. Assume $D$ of the same form as in Theorem $I_2$. If $N + (k - n)D'$ contains $\alpha_i x + \beta_i$, $i = 1$ or 2, as a factor, then $P_{n+1}$ contains $\alpha_i x + \beta_i$ and no higher power of $\alpha_i x + \beta_i$ as a factor.

Proof: From (1) we see that $P_{n+1}$ contains $\alpha_i x + \beta_i$ at least to the first power as a factor. Again from (1), if $P_{n+1}$ contains a higher power of $\alpha_i x + \beta_i$ as a factor, this means that both $P_n$ and $P_n'$ contain $\alpha_i x + \beta_i$ at least to the first power as a factor and from Lemma (1) it follows that $P_n$ contains $\alpha_i x + \beta_i$ at least to the second power as a factor. By Corollary (1) from Theorem $I_4$ it follows that $\alpha_i x + \beta_i$ is a factor of $N + (k - n_1)D'$ for $n_1 < n$, contrary to Lemma (2).

Theorem $I_6$. If $\alpha_1 x + \beta_1$ and $\alpha_2 x + \beta_2$ are factors of $N + (k - n_1)D'$ and $N + (k - n_2)D'$ respectively, $(\alpha_1, \alpha_2 \ne 0)$, then $P_\mu \equiv 0$, $\mu > n_1 + n_2$.

Proof: From Theorems $I_5$ and $I_2$ we see that $(\alpha_1 x + \beta_1)^{n_2} (\alpha_2 x + \beta_2)^{n_1}$, of degree $n_1 + n_2$, is a factor of $P_{n_1+n_2}$, of degree $n_2 + n_1$ at most. Similarly, $(\alpha_1 x + \beta_1)^{n_2+1} (\alpha_2 x + \beta_2)^{n_1+1}$, of degree $n_2 + n_1 + 2$, is a factor of $P_{n_1+n_2+1}$, of degree $n_2 + n_1 + 1$ at most. This implies $P_{n_1+n_2+1} \equiv 0$. Hence, $P_\mu \equiv 0$, $\mu > n_1 + n_2$. In fact, (1) shows that $P_\mu \equiv 0$ implies $P_\nu \equiv 0$, $\nu > \mu$.

Theorem $I_7$. Assume $D$ of the same form as in Theorem $I_2$. Then $P_{n+1} \equiv 0$, $P_n \not\equiv 0$, implies either $N + (k - m)D' \equiv 0$, $m \le n$, or there exist two values of $m$, $(m_1, m_2)$, such that $N + (k - m_1)D'$, $N + (k - m_2)D'$ contain as factors $\alpha_1 x + \beta_1$ and $\alpha_2 x + \beta_2$ respectively, $(m_1, m_2 \le n)$.

Proof: Setting $P_{n+1} \equiv 0$ in (1) gives,

($1^a$) $[N + (k - n)D']\,P_n + D\,P_n' \equiv 0$.

If $P_n = $ const., ($1^a$) shows that $N + (k - n)D' \equiv 0$ and our theorem is verified. Suppose $P_n \ne $ const. We get from ($1^a$),

$$P_n' = -\frac{[N + (k - n)D']\,P_n}{D}.$$

Thus, $D$ is a factor of the numerator, and our theorem now follows from Corollaries (1) and (2) of Theorem $I_4$.

Theorem $I_8$. If $N + (k - m)D' \not\equiv 0$, $m = 1, 2, \cdots n$, and if $N + (k - m)D'$ contains neither $\alpha_1 x + \beta_1$ nor $\alpha_2 x + \beta_2$ as factors, then $P_{n+1}$ and $D$ have no factors in common. This follows at once from Theorems $I_3$ and $I_5$ which constitute a necessary and sufficient condition that $P_n$ and $D$ have factors in common.

Theorem $I_9$. If $N \equiv $ const. and if $D$ is linear, all $P_n$ are constants, $n = 1, 2, 3, \cdots$. This follows directly from (2).

Theorem $I_{10}$. If $N' + \dfrac{2k - m}{2} D'' \not\equiv 0$, $m = 1, 2, \cdots (n - 1)$, all zeros of $P_n$ which are not zeros of $D$ are simple.

Proof: Suppose $P_n$ has a multiple zero $x = a$ which is not a zero of $D$. Then (1) shows that $a$ is a zero of $P_{n+1}$. From (2), $a$ is a zero of $P'_{n+1}$. From Lemma (1), $a$ is at least a double zero of $P_{n+1}$. Furthermore, (3) shows that $a$ being a double zero of $P_n$ and of $P_{n+1}$ is also a double zero of $P_{n-1}$. By a continued application of (3), it follows that $a$ is a double zero of $P_1$ which is impossible since $P_1$ is of degree $\le 1$.

II. Concerning the Zeros of $P_n(k, x)$

The polynomials $P_n(k, x)$ are defined by Hildebrandt³ as follows:

$$P_n(k, x) = \frac{D^{n-k}}{y}\, \frac{\partial^n (D^k y)}{\partial x^n},$$

where $y$ is a non-identically vanishing solution of the differential equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{a_0 + a_1 x}{b_0 + b_1 x + b_2 x^2} \equiv \frac{N}{D}.$$


³ L.c. pp. 400-401.
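The Jacobi Polynomials are defined as follows:

$$J_n(x, \alpha, \beta) = x^{1-\alpha}(1 - x)^{1-\beta}\, \frac{d^n}{dx^n}\left[ x^{\alpha+n-1}(1 - x)^{\beta+n-1} \right], \qquad \alpha, \beta \text{ real}.$$

It follows that $J_n(x, \alpha, \beta)$ is a special type of $P_n(k, x)$ with $N \equiv (-\beta - \alpha)x + \alpha$, $D \equiv x(1 - x)$, $n = k + 1$, whence,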


$$N' = -\beta - \alpha, \quad D' = 1 - 2x, \quad D'' = -2; \quad D(0) = D(1) = 0,$$

$$P_1(k, x) \equiv N + kD' = 0 \quad \text{for} \quad x = \frac{\alpha + k}{\alpha + \beta + 2k},$$

$$P_1'(k, x) = -\beta - \alpha - 2k.$$

In determining the number and location of the real zeros of the Jacobi Polynomials
we employ the following notations:

$$P_i(k, x) = 0 \ \text{for} \ x = \alpha_{i,k,j}; \quad i = 1, 2, \ldots, k + 1; \quad k = 0, 1, 2, \ldots; \quad j = 1, 2, \ldots, i.$$

$$\theta \equiv N' + \frac{2k - n}{2} D'' = -\beta - \alpha - 2k + n, \quad n = 1, 2, \ldots, k,$$

$$\mu \equiv [N + (k - n)D']_{x=0} = \alpha + (k - n),$$

$$\nu \equiv [N + (k - n)D']_{x=1} = -\beta - (k - n).$$

We proceed to determine the number of real zeros of the Jacobi Polynomials
on the intervals $(-\infty, 0)$, $(0, 1)$, $(1, \infty)$ into which the zeros of $D$ divide the
$x$ axis.* The proofs proceed by mathematical induction. We first determine
the location of the real zeros of $P_n(k, x)$, $n = 1, 2, \ldots, k + 1$, by successive
applications of (1) and (2). We then use the relation $P_{k+1}(k, x) \equiv J_{k+1}(x, \alpha, \beta)$.

Several cases concerning possible values of $\alpha$ and $\beta$ should be considered. In
order to bring out the method of procedure only two such cases will be fully
discussed here. The results for other possible cases will be merely listed.

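$A_1$: $\alpha < 0$, $\beta < 0$, $|\alpha| < |\beta|$; $\alpha$, $\beta$, $\alpha + \beta$ not integers.

Let $k_1$ be the greatest integer contained in $\alpha$, let $k_2$ be the greatest integer
contained in $\beta$, and let $k_3$ be the greatest integral value of $k$ for which
$\alpha + \beta + 2k < 0$. Then

$$0 \le k_1 \le k_3 \le k_2.$$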


* In the case $\alpha, \beta > 0$ these zeros all lie, as is known, on $(0, 1)$.
$A_{11}$: $0 \le k \le k_1$. We then have $\theta > 0$, $\mu < 0$, $\nu > 0$, $0 < \alpha_{1,k,1} < 1$, $P_1' > 0$.
Then $J_{k+1}(x, \alpha, \beta)$ has

$$\frac{(1)^k + (-1)^k}{2}$$

zeros in $(0, 1)$. These are the only real zeros.

Proof: Consider first $P_1(k, x)$. Its only zero is at $\alpha_{1,k,1}$, where $0 < \alpha_{1,k,1} < 1$.
Furthermore, $P_1' > 0$. Also $P_1 > 0$ for $x > \alpha_{1,k,1}$ and $< 0$ for $x < \alpha_{1,k,1}$. From
(1) we see that $P_2(k, \alpha_{1,k,1}) > 0$, (since $P_1(k, \alpha_{1,k,1}) = 0$, $D(\alpha_{1,k,1}) > 0$ and $P_1' > 0$).
From (2) it follows that $P_2'(k, x) < 0$ for $x < \alpha_{1,k,1}$, $P_2'(k, \alpha_{1,k,1}) = 0$, $P_2'(k, x) > 0$
for $x > \alpha_{1,k,1}$. These conclusions follow from remarks concerning the sign of $\theta$,
the fact that $P_1(k, \alpha_{1,k,1}) = 0$, and from remarks concerning the sign of $P_1$ to the
left and to the right of $x = \alpha_{1,k,1}$. Thus, $P_2(k, x) > 0$ for all real $x$ and hence
has no real zeros. By employing (2), it is now evident that $P_3'(k, x) > 0$. From
(1) and remarks concerning $\mu$ and $\nu$ we see that $P_3(k, 0) < 0$ and $P_3(k, 1) > 0$.
Thus $P_3(k, x)$ has a single real zero $\alpha_{3,k,1}$, $0 < \alpha_{3,k,1} < 1$. The reasoning from
$P_3$ to $P_4$ is analogous to that from $P_1$ to $P_2$. By continuing this procedure we
finally conclude that $P_{k+1}(k, x)$, $(\equiv J_{k+1}(x, \alpha, \beta))$, has but one real zero, (in $(0, 1)$),
if $k$ is even and no real zeros if $k$ is odd.


$A_{12}$: $k_1 < k \le k_3$. Set $k = k_1 + q$, $q = 1, 2, \ldots, k_3 - k_1$. Here $\theta > 0$;
$\mu > 0$, $n = 1, 2, \ldots, q - 1$; $\mu < 0$, $n = q, q + 1, \ldots, q + k_1$; $\nu > 0$, $\alpha_{1,k,1} < 0$,
$P_1'(k, x) > 0$. $J_{k_1+q+1}(x, \alpha, \beta)$ has $q$ distinct zeros in $(-\infty, 0)$ and

$$\frac{(1)^{k_1} + (-1)^{k_1}}{2}$$

zeros in $(0, 1)$. These are the only real zeros.


Proof: First consider the sequence Pn(k, x) n = 1,2, • • • q, since the conditions 
on 0, y, and v do not change over this range of n. Now Pi(k, ai.t,i) = 0, ai,i,i < 
0. Furthermore since Pi > 0 we have Pi > 0 for x > ai.i,i and < 0 for x < 
«i.i,i . Pass now to Pz(k, x). Since D(ai,k,i) < 0 and Pi {k, ai,*,i) > 0, we see 
from (1) that Pj(k, ai,t,i) < 0. Moreover (2) shows Pj (k, ai,t,i) = 0, P^ (k, x) 
< 0 for X < ai.t.i and > 0 for x > ai.t.i . Thus Ps(fc, x) < 0 and a relative 
minimum at x = <* 1 . 1 , 1 . Since j P*(fc, ± » ) 1 = « , we see that Ptik, x) has two 
real zeros of which the left most, a 2 .i,i , is in (— «, 0). Again y > 0 together 
with (1) assures Pi(fc, 0) > 0. Thus as.u is in (ai.i.i , 0), hence in (— «, 0). 
By continuing this reasoning on the successive Pn{k, x), n = 1, 2, • • • q, we 
conclude that P«(k, x) has q zeros in — » , 0 and Pq{k, a«.t,i) < 0. 

Next, consider the sequence Pn(k, x), n = g + 1, 9 + 2, • • • g + fci + 1. 
Over this range of n we have > 0, /< < 0, »» > 0. From what has just been 
shown, Pq(ik, aq,k.i) = 0, — 00 < a,,i.i < 0, t = 1, 2, • • • g. Also P',{k, 
i — 1, 2, • • • g, is alternately negative and positive. Suppose g odd, (similar 
reasoning holds for g even). Thus, we suppose P'qik, Oq.k.i) < 0, Pqik, aq.k,q) < 
0, Pq(k, x) > 0 for X < a,.fc.i and < 0 for x > a,.*., . (1) shows Pq+i(k, a,.*.,), 

i = 1, 2, • • • g, to be alternately positive and negative. Thus, the zeros 
are separated by g — 1 zeros of Pq+i(k, x). Since from (1), P«+i(fc, <*«.i.i) > 0 
and from (2) Pq+i(.k, x) > 0 for x < <*,.1.1 , there exists a zero a,+i.t.i in (— «, 
<xq.k.i). Thus far, we have established the existence of g zeros of Pq+i(k, x) in 
00 , 0). g being odd, we have from (1), Pq+iik, > 0. Also from (2), 
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x) < 0 for « > otq,k,t • Again from (1) and assumptions regarding n and 
V it follows that 0) > 0, P*+i(A:, 1) < 0. Thus, P*+i(A:, x) has a zero 

««+i.».*+i in (0) !)• There being no extrema for P«+i(fc, x) other than the , 
i = 1, 2, • • • q, (as (2) shows), we have thus proved that P«.i(fc, x) has q 
distinct zeros in (— oo , 0) and a single zero in (0, 1). Reasoning similarly from 
x) to Pf^k(k, x) we establish the existence of q distinct zeros , 

t = 1, 2, • • • g, in (- «, 0) with in (- «, a*+i.i.i) and a,+s.t.i , i - 

2, 3, • • • g, separating , t = 1, 2, • • • g. From (1) we see that P«+*(fc, 

< 0 and Pt^(k, < 0. The only extrema of P,^{k, x), 

(as (2) shows), are located at , i = 1, 2, • • • g + 1. ^ain, by (2), 

Pi+*(A!, x) < 0 for X > ; hence there can be no real zeros of P,+j except 

the g zeros in (— », 0) already found. The reasoning from P^ft to P^ is 
similar to that from P, to P^i . Thus, P, 4 .*,+i = has g distinct zeros in 

(— 00 , 0), tc^ther with one zero in (0, 1) for ki even. For ki odd, there are g 
distinct zeros in (— «, 0) only. The results are the same whether g is odd or 
even. 

The results for the remaining sub-cases under case Ai are given in the table 
which follows. For comjdeteness, the results for cases An and An are included 
in the tabulation. A few words of explanation are necessary to clarify the 
conditions under which the various sub-cases in the table occur. Let 1 « 1 = 
ki + q, \ 0 I = kk + h, h, q < 1. If g -f h < 1, then | a -|- /3 | = fci -f- and we 
have either. 

Am : ki + ki even, 2ks = ki -j- kt ^ kt — ki = kt — kt . 

Am : ki + kt odd, 2 fc 3 = fci + fcj — 1 = fcs — = fc* — fca — 1. 

Again if 1 < g -t- A < 2, then |a-|-/9| = ki + kj + 1 and we have either, 

Am : fci -f ft* + 1 even, 2fcj = fci -|- + 1 = fcs — fci = fcj — fca + 1. 

Am : Jfci -f -f- 1 odd, 2kt = ki + h = kt — ki = kt — k» 

In eases Am and Am we assiune la-f/S|a=A:i-|-A;s + p, P<1, while in cases 
Am and Am ,|a-4-/S| = A:i-fA:*-t-p, 1 < p < 2. The complete results for 
case Ai follow. (See page 213.) 

$A_2$: $\alpha < 0$, $\beta < 0$, $|\alpha| < |\beta|$; $\alpha$, $\beta$ not integers, $\alpha + \beta$ = integer. Define $k_1$,
$k_2$, $k_3$ as in $A_1$. Then $0 \le k_1 \le k_3 \le k_2$. In Case $A_{21}$, $\beta + \alpha$ is odd while in
Case $A_{22}$, $\beta + \alpha$ is even. (See page 214.)

$A_3$: $\alpha < 0$, $\beta < 0$, $\alpha = -k_1$, integer, $\beta$ not an integer, $|\alpha| < |\beta|$. Define
$k_1$, $k_2$, $k_3$ as in $A_1$. Then $0 \le k_1 \le k_3 \le k_2$. There are two sub-cases. $A_{31}$: the
greatest integral value of $\alpha + \beta$ is odd. $A_{32}$: this integral value is even. (See
page 215.)

$A_4$: $\alpha < 0$, $\beta < 0$, $\alpha$ not an integer, $\beta = -k_2$, integer, $|\alpha| < |\beta|$. Define
$k_1$, $k_2$, $k_3$ as in $A_1$. Then $0 \le k_1 \le k_3 \le k_2$. There are two sub-cases. $A_{41}$:
the integral part of $\alpha + \beta$ is odd. $A_{42}$: this integral value is even. (See page 216.)
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Jki+k 2 +q+i ; ? * 2, 3, • • • ; Same zeros as in Aui for corresponding values of q. 
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A6:a<0, /3<0, |a|<|i9l,a=r — fcj integer^ = —h integer. Define 


ki fkifkt asm Ai. 

CaaeB Polynomiftl 

In cases A^i and A®* , a + jS is odd and even respectively. 

Range of Sub-Seript Zoroe in 



(-«,0) 

a? - 0 

(0,1) 

(D* + (- 1 )* 

2 ' 

Asm Asai; 


0 <k<kii 0; 

0; 

Asisi Ami 


5 - 0, 1, 2, •• •,*8 - h; q; 

A;i + 1 ; 

0 

Afiis; 

J fci+«+i 1 

1 

1 

•0 

1 

1 

.—1 

It 

Aji + 1; 

0 

Afiss; 

A 514 , A 584 ; 

A(15, Ami 

•f* j-Hr+i ; 

J fcj+ff+i * Oj 

* Oj 

g » 1, 2, • • A ;2 - Arg - 1; Ajs - fci - g + 1; 

g 0, 1, 2, * • • hi, 

g - 1, 2, 3, • • 


0 


If assumptions are identical with those of As except 1 a | = | /S j , then for 
0 < it < fci , the results agree with Asu and Jki+q-^i = 0, g = 0, 1, 2, • • • . 

As : a > 0, i3 < 0, I a I > \fi\yfinotan integer. Let fci be the largest integer 
in ff. 

Case Polynomial Range of Sub-Script 

( 0 , 1 ) 

Asi J 0 ^ fc fci 0 

Ass Jki+q+\ g=l, 2, 3, ••• g 


Zeros in 
(1, *) 

2 

( 1 )*> + (- 1 )*' 
2 


Ay : Same assumptions as in At except (3 = —ki, integer . 


Case 

Polynomial 

Range of Sub-Script 

(0, 1) 

Zeros in 

X = 1 

A 71 

Jk-hi 

0 < fc < Jti - 1 

0 

0 

At, 

J ki^q+l 

? = 0, 1, 2, . . • 

9 

fci + 1 


A8;a>0, /3<0, la| = |/3l. Ji - a and results for J„ , n > 1 are 
identical with those in A? and A» respectively according as jS is or is not an integer. 
A* : a > 0, |3 < 0, I a I < 1 ^ 1 ; /3, « + /3, nof integers. 

Let fci be the greatest integer in a + |8. 


it 

ti 



It 


It 

tt 


n 

tt 


for which a + j8 + 2fc < 0. 


it tt 
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Then 0 < < fci < Asi . 


Case 

Folsoionual 

Ranga of Sub-Soript 


Zeros in 




(-•0,0) 

(0, 1) 

(ii «) 

Aw; 


0 ^ jfc < ibg; 

fc + 1 ; 

0 ; 

0 

Am; 


j - 1, 2, • • fcieven; 

fc, — a + 1; 

0 ; 

0 

Am; 


5 - 1, 2, • • •, (lb, -1- 1); fci odd; 

fc, — a + 2; 

0 ; 

1 

Aw; 


a - 1, 2, •••, (A^ - *i); 

0 ; 

0 ; 

2 

Am; 


a - 1,2. 3 , • • •; 

0 ; 

9; 

(1)** + (--1)** 

2 


Aio : Same aaaumptiona as in At bui now | a | = | | . Then ki := kt = 0, 

Ji K a, and results for , n > 1 are the same as in An and An . 

All : Same assumptions as in At except = —kt, integer. 

Case Polynomial Range of Sub-Script Zeros in 

(-«,0) (0, 1) X = 1 (1, «) 

Aii.i Same as Ati 
All , 2 Same as An 
Aii.s Same as A«g 

Au.4 J*,+ 9 +i; g = 1, 2, 3, • • • ; 0; q; A ;2 + 1; 0 

Ai 2 : a > 0, jS < 0, \ a \ < \ \ , p not an integer, a -J- = odd integer. 

Define iki , A: 2 , A;s as in A» . 

Aij : Same assumptions as in Am except a /S = even integer. 


Cases 

Polynomial 

Range of Sub>Script 

Zeros in 

A 12 . 1 , Ais.i; 

Same as An 



(-C0, 0) 

Ai2,2 ; 1 

[Jtk,+t = const. > 0; 

?-d 

tl 

■■■ ,h; 

kt-q + 1 

Ai8,2 ; 1 

Ai2,8> Ai8,8; 

Ai2.4) Ai8,4 ; 

[«ft,+8+i; 

[.f 2*,+8 = const. > 0; 
Same as A,, 

Same as An 

g = l,2, . 

• ' , ki 1 ; 

kt-q + 2 
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Ai 4 : (Some assumptions as in An, except /3 = — fc* integer. Cases Ai44 , 
Ai4^ and Ai4,» have the same results as Ai*,i , Ai*,* , and Aii,j respectively. 
Ai 4,4 has the same results as Au, 4 . 

A« : Same assumptions as Au except /3 = — fc* , integer. Cases Au,i , A«.j , 
and Au,s have the same results as Au,! , Ai*,* , and Aw, 8 respectively. Ais .4 has 
the same results as An, 4 . 

An : a = 0, /S < 0, /3 — no< on integer. 

Let ki be the largest integer contained in /3. 

“ kt be the largest integer for which $ + 2k < 0. 


Case 

Polynomial 

Range of Sub-Script 


Zeros in 





(-«,0) 

(0, 1) 

(i» «) 

Ai6,i; 

Jm; 

0 < * ^ 

*; 1; 

0; 

0 




kt — q; 1; 

0; 

0; ki even 

Ai6,2; 


q ~ 1 , 2 , , ki - kzi 







" 1-9 + 1; 1; 

0; 

1; odd 






(1)*‘ + (-1)** 

Ai6,8; 


? “ 1, 2, 3, • • • ; 

0; 1; 

9-1; 

2 


Ait : a = 0, d = — fci — odd integer. Define h as in Aw . 
Ai* : a = 0, (9 = — fci — even integer. Define k^ as in Aw • 


Cases 

Ai7,i, Ai8,i; 

Polynomial 

Same as Aio.i 

Range of Bub-Boript 

(-», 0) 

Zeros in 

*-o ( 0 , 1 ) 

II 

Ai7,2; 


g - 1 , 2 , • • • , All - *8 - 1; 

ki-q; 

1; 

0; 

0 

Ai8,2; 

Ai7,8» Ai8,8; 

“ 0 

g =» 1, 2, • • • , ^8 + 1; 

fcs — 9 + 1; 

1; 

0; 

0 

Ai9 : a 

= 0, ^ = 0. 

g “ 1,2,3, •••; 

Ji s 0. 

0; 

1; 

9-1; 

"h 1 


Jk+i has fc — 1 zeros in (0, 1), 1 zero at x =0, 1 zero at * = 1, Is = 1, 2, 3, 

From the definition of $J_n(x, \alpha, \beta)$ it is readily seen that $J_n(x, \alpha, \beta) = (-1)^n J_n(1 - x, \beta, \alpha)$.
Thus, a transformation of $x$ to $1 - x$ interchanges $\alpha$ and $\beta$.
The interval $(-\infty, 0)$ is transformed into $(1, \infty)$ and vice-versa. The points
$x = 0$ and $x = 1$ are interchanged. Consequently, in all previous results we
may interchange properly $\alpha$ and $\beta$.

In the foregoing results, the only real multiple zeros that can occur are at 
either x = 0 or x = 1. In the process of determining the degree of multiplicity 
of such zeros use was made of Theorem /* . 

Points of Inflexion. By taking (4), setting $k = n$, and replacing $N'$ and $D''$
by their values for Jacobi polynomials, we get:

$$P''_{n+1}(n, x) = (n + 1)(n)\,[\beta + \alpha + n]\,[\beta + \alpha + n + 1]\, P_{n-1}(n, x).$$

From definitions of $P_n(k, x)$ and $J_n(x, \alpha, \beta)$ we easily verify that,

$$P_n(n \pm q, x) \equiv J_n(x, \alpha \pm q + 1, \beta \pm q + 1),$$

whence,

$$J''_{n+1}(x, \alpha, \beta) = (n + 1)(n)\,[\beta + \alpha + n]\,[\beta + \alpha + n + 1]\, J_{n-1}(x, \alpha + 2, \beta + 2).$$

We conclude that if neither $\alpha + \beta + n$ nor $\alpha + \beta + n + 1$ vanishes, the points
of inflexion of $J_{n+1}(x, \alpha, \beta)$ are at the zeros of odd order of $J_{n-1}(x, \alpha + 2, \beta + 2)$.

The Degree of $J_n(x, \alpha, \beta)$. In analyzing the results of cases $A_1$ to $A_{19}$ inclusive,
it is noted that in some cases the number of real zeros of $J_n$ is less than $n$. The
question naturally arises whether the degree of $J_n$ is $n$ or less, for then we can
determine the number of its imaginary zeros. The explicit expression of
$J_n(x, \alpha, \beta)$ is known, from which the degree of $J_n$ can be found for various $\alpha$ and
$\beta$. However, the degree of $J_n$ can also be found from (4).

Since $J_{n+1}(x, \alpha, \beta) \equiv P_{n+1}(n, x)$, let us replace $k$ by $n$ in (4) and at the same
time replace $N'$ and $D''$ by their values for Jacobi Polynomials. Thus, we get:

$$J^{(q)}_{n+1}(x, \alpha, \beta) = \prod_{i=0}^{q-1} (n + 1 - i)\,[-\beta - \alpha - n - i]\; P_{n-q+1}(n, x), \tag{5}$$

$$n = 0, 1, 2, \ldots; \quad q = 0, 1, \ldots, (n + 1).$$


We may establish the following results. 

$C_1$) If $\alpha + \beta$ is not an integer, the degree of $J_{n+1}(x, \alpha, \beta)$ is $n + 1$, $n = 0, 1, 2, \ldots$.

In fact, in order for $J^{(q)}_{n+1}$ to vanish, we see from (5) that either some factor
$-\beta - \alpha - n - i$ vanishes or $P_{n-q+1}(n, x)$ vanishes identically. We first show
that the latter is not possible. Now $P_1(n, x) = N + nD' \equiv (-\beta - \alpha - 2n)x + \alpha + n \not\equiv 0$
since $\beta + \alpha$ is not an integer. Consequently, if $P_\mu(n, x) \equiv 0$,
$\mu > 0$, $\mu < n + 1$, there will be a first value of $\mu$, $(\mu = \nu)$, for which $P_\nu(n, x) \equiv 0$
but $P_{\nu-1}(n, x) \not\equiv 0$. By virtue of Theorem 7? this means that either
$N + (n - p)D' \equiv [-\beta - \alpha - 2(n - p)]x + \alpha + n - p \equiv 0$, $p < \nu$, or else there
exist two values of $p$, $(p_1, p_2)$, such that $[-\beta - \alpha - 2(n - p_1)]x + \alpha + n - p_1$
and $[-\beta - \alpha - 2(n - p_2)]x + \alpha + n - p_2$ are divisible by $x$ and $1 - x$
respectively, $p_1, p_2 \le \nu - 1$, $p_1 \ne p_2$. Since, however, $\alpha + \beta$ is not an integer
we see that $[-\beta - \alpha - 2(n - p)]x + \alpha + n - p \not\equiv 0$, $n$ and $p$ being integers.
This eliminates the first possibility that $P_\mu(n, x) \equiv 0$, $\mu < n + 1$. Again, if
$[-\beta - \alpha - 2(n - p_1)]x + \alpha + n - p_1$ is divisible by $x$, we have $\alpha + n - p_1 = 0$
or $\alpha$ an integer. For

$$(\alpha + n - p_2) - [\beta + \alpha + 2(n - p_2)]x \equiv (\alpha + n - p_2)\left[1 - \frac{\beta + \alpha + 2(n - p_2)}{\alpha + n - p_2}\,x\right]$$

to be divisible by $1 - x$ requires $\beta + n - p_2 = 0$ or $\beta$ an integer. $\alpha$ and $\beta$ are
therefore both integers contrary to hypothesis. Thus, in (5), no polynomial
$P_{n-q+1}(n, x) \equiv 0$ and $J^{(q)}_{n+1}(x, \alpha, \beta) \not\equiv 0$.
Replacing $q$ by $n + 1$ in (5) leads to,

$$J^{(n+1)}_{n+1}(x, \alpha, \beta) = \prod_{i=0}^{n} (n + 1 - i)\,[-\beta - \alpha - n - i]\; P_0(n, x), \tag{6}$$

$$n = 0, 1, 2, \ldots.$$

Thus $J^{(n+1)}_{n+1} \not\equiv 0$, (since $P_0(n, x) \equiv 1$ and no factor $-\beta - \alpha - n - i$ can vanish),
and the degree of $J_{n+1}$ is precisely $n + 1$. From similar reasoning we prove:
$C_2$) If $\alpha + \beta > 0$ the degree of $J_{n+1}$ is $n + 1$, $n = 0, 1, 2, \ldots$.

$C_3$) If $\alpha + \beta = 0$, then (I) $J_1 \equiv \alpha$ and (II) $J_{n+1}$ is of degree $n + 1$, $n = 1, 2, 3, \ldots$.

$C_4$) If $\alpha + \beta = -M$, integer, $M > 0$, $\beta$, $\alpha$ not integers, then,

(I) For $n < M$, the degree of $J_{n+1}$ is min. $(n + 1, M - n)$.

(II) $n = M$, $J_{n+1} \equiv$ const.

(III) $n > M$, the degree of $J_{n+1}$ is $n + 1$.

$C_5$) If $\alpha + \beta = -M$, integer, $M > 0$, $\alpha$, $\beta$ integers, $\alpha > 0$, $\beta < 0$, then,

(I) For $n < M$, the degree of $J_{n+1}$ is min. $(n + 1, M - n)$.

(II) $n = M$, $J_{n+1} \equiv$ const.

(III) $n > M$, the degree of $J_{n+1}$ is $n + 1$.

$C_6$) If $\alpha + \beta = -M$, integer, $M > 0$, $\alpha = -k_1$, integer, $\beta = -k_2$, integer,
$k_1 < k_2$, then,

(I) For $n < k_1$, $J_{n+1}$ is of degree $n + 1$.

(II) $n > k_1$, $J_{n+1} \equiv 0$.

$C_7$) If $\alpha + \beta = -M$, integer, $M > 0$, $\alpha = \beta = -k_1$, integer, then,

(I) For $n < k_1$, $J_{n+1}$ is of degree $n + 1$.

(II) $n > k_1$, $J_{n+1} \equiv 0$.

The Laguerre Polynomials. These are defined as follows:

$$L_n \equiv L_n(x, \alpha) = x^{1-\alpha} e^{x} \frac{d^n}{dx^n}\left[e^{-x} x^{\alpha + n - 1}\right], \quad n = 0, 1, 2, \ldots;$$

$\alpha$ real. We see that $L_n$ is a special case of $P_n(k, x)$ with $N \equiv -x + \alpha$,
$D \equiv x$, $n = k + 1$. It follows that $\theta = -1$, $\mu = \alpha + k - n$, $\alpha_{1,k,1} = \alpha + k$,
and $P_1'(k, x) = -1$. These can be used in determining the location of the real
zeros of $L_n$, as was done for $J_n$. The discussion here is somewhat simplified
since $L_n$ has but one parameter, $\alpha$, and the $x$-axis is divided by the zeros of $D(x)$
into two segments only, namely, $(-\infty, 0)$ and $(0, \infty)$.

The following results are easily obtained.

$B_1$: $\alpha > 0$. $L_n(x, \alpha)$ has $n$ distinct zeros in $(0, \infty)$, $n = 1, 2, 3, \ldots$. This
result is well known.

$B_2$: $\alpha = 0$. $L_{n+1}(x, \alpha)$ has $n$ distinct zeros in $(0, \infty)$ and a simple zero at $x = 0$,
$n = 0, 1, 2, \ldots$.

$B_3$: $\alpha < 0$, $\alpha$ not an integer. Let $k_1$ be the largest integer contained in $\alpha$.

(I) $L_{k+1}(x, \alpha)$ has $\dfrac{(1)^k + (-1)^k}{2}$ zeros in $(-\infty, 0)$, $0 \le k \le k_1$.

(II) $L_{k_1+q+1}(x, \alpha)$ has $q$ distinct zeros in $(0, \infty)$ and $\dfrac{(1)^{k_1} + (-1)^{k_1}}{2}$ zeros in
$(-\infty, 0)$, $q = 0, 1, 2, \ldots$.

$B_4$: $\alpha < 0$, $\alpha = -k_1$, integer.

(I) $L_{k+1}(x, \alpha)$ has $\dfrac{(1)^k + (-1)^k}{2}$ zeros in $(-\infty, 0)$, $0 \le k < k_1$.

(II) $L_{k_1+q+1}(x, \alpha)$ has $q$ distinct zeros in $(0, \infty)$ and a zero of order $k_1 + 1$ at
$x = 0$, $q = 0, 1, 2, \ldots$.

The Degree of $L_n(x, \alpha)$. We show first that here $P_\mu(n, x) \not\equiv 0$, $\mu = 1, 2, \ldots, n + 1$.
By definition, $P_1(n, x) \equiv N + nD' \equiv -x + \alpha + n \not\equiv 0$. Let us
rewrite (2) for our present situation thus:

$$P'_\mu(n, x) = -\mu P_{\mu-1}(n, x). \tag{2a}$$

If, now, $P_\mu(n, x) \equiv 0$, then from (2a) it follows
that $P_{\mu-1}(n, x) \equiv 0$. Continuing this reasoning, we finally arrive at a contradiction,
namely, $P_1(n, x) \equiv 0$. If in (4) we set $q = n + 1$ and replace $N'$ and $D''$
by their values we get:

$$L^{(n+1)}_{n+1}(x, \alpha) = (-1)^{n+1}(n + 1)!\, P_0(n, x) = (-1)^{n+1}(n + 1)!$$

Hence, $L_{n+1}$ is of degree $n + 1$. Note that this holds regardless of the value of
$\alpha$, contrary to what was found for Jacobi Polynomials.

Points of Inflexion. By a procedure analogous to that used for Jacobi Polynomials
we can show that the points of inflexion of $L_{n+1}(x, \alpha)$ are located at the
zeros of odd order of $L_{n-1}(x, \alpha + 2)$.

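The Polynomials $P_n(0, x)$. If we set $k = 0$ in (1), (2), and (3) we obtain the
following relationships for $P_n(0, x)$* $\equiv P_n(x) \equiv P_n$.

$$P_{n+1}(x) = [N - nD']\, P_n(x) + D\, P'_n(x). \tag{7}$$

$$P'_{n+1}(x) = (n + 1)\left[N' - \frac{n}{2} D''\right] P_n(x). \tag{8}$$

$$P_{n+1}(x) = [N - nD']\, P_n(x) + n\left[N' - \frac{n - 1}{2} D''\right] D\, P_{n-1}(x). \tag{9}$$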
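
As a hedged check of these relations, the sketch below builds the sequence from relation (7), read here as $P_{n+1} = [N - nD']P_n + DP_n'$ with $P_0 = 1$, and then tests relations (8) and (9) term by term. It uses the Hermite-type choice $N = -x$, $D = 1$; the normalisation $D = 1$ and all helper names (`trim`, `add`, `mul`, `deriv`, `sequence`, `theta`) are assumptions made for illustration, not notation from the paper. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def scale(c, p):
    return trim([c * a for a in p])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def deriv(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [0])


def sequence(N, D, count):
    """P_0 = 1, then relation (7): P_{n+1} = (N - n D') P_n + D P_n'."""
    P = [[1]]
    dD = deriv(D)
    for n in range(count):
        bracket = add(N, scale(-n, dD))
        P.append(add(mul(bracket, P[n]), mul(D, deriv(P[n]))))
    return P


def theta(N, D, n):
    """The factor N' - (n/2) D'' of relation (8); N linear, D at most quadratic."""
    dN = N[1] if len(N) > 1 else 0
    ddD = 2 * D[2] if len(D) > 2 else 0
    return dN - Fraction(n, 2) * ddD


N = [0, -1]  # N(x) = -x
D = [1]      # D(x) = 1 (assumed normalisation of the constant D)
P = sequence(N, D, 5)

for n in range(1, 5):
    lhs8 = deriv(P[n + 1])
    rhs8 = scale((n + 1) * theta(N, D, n), P[n])
    rhs9 = add(mul(add(N, scale(-n, deriv(D))), P[n]),
               scale(n * theta(N, D, n - 1), mul(D, P[n - 1])))
    # Each line shows n, the coefficients of P_n, and whether (8) and (9) hold.
    print(n, P[n], lhs8 == rhs8, P[n + 1] == rhs9)
```

With this choice the recurrence gives $P_2 = x^2 - 1$ and $P_3 = -x^3 + 3x$, whose zeros interlace as Theorem II$_1$ describes. Exact rational arithmetic is used so that the comparisons in (8) and (9) are not affected by rounding.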

Theorems h to In inclusive, with ft = 0, hold for Pn(x). In addition, the 
following theorems hold for P„ . 

Theorem II$_1$. Suppose $N$ linear and $D(x) > 0$ for all $x$. Furthermore, let
$N' - \frac{m}{2} D'' < 0$, $m = 1, 2, 3, \ldots$. Then $P_n$ has $n$ real, distinct zeros which
separate the zeros of $P_{n+1}$.

Proof: Denote the zeros of $P_n$ by $\alpha_{n,i}$, $i = 1, 2, \ldots, n$, $\alpha_{n,i} < \alpha_{n,i+1}$. Suppose
$N' > 0$. $N$ being linear has a single zero $\alpha_{11}$. Furthermore, since $P_1 \equiv N$,
then $P_1 < 0$ for $x < \alpha_{11}$ and $> 0$ for $x > \alpha_{11}$. We pass now to $P_2$. From (7),
we see that $P_2(\alpha_{11}) > 0$, (since $D > 0$ and $P_1' > 0$). Also (8) shows $P_2'(x) > 0$

* E. H. Hildebr&ndt, loc. cit. pp. 390. 
for $x < \alpha_{11}$ and $< 0$ for $x > \alpha_{11}$. This follows from what was noted concerning
the sign of $P_1$ for $x > \alpha_{11}$ and $x < \alpha_{11}$, together with the hypothesis that
$N' - \frac{m}{2} D'' < 0$. Thus, there exists a zero of $P_2$ in $(-\infty, \alpha_{11})$ and a zero in $(\alpha_{11}, \infty)$
and our theorem holds for $n = 1$. Assume that the theorem is true for $n = h$.
The sequence $P_{h+1}(\alpha_{h,i})$, $i = 1, 2, \ldots, h$, is alternately positive and negative.
Since, from (8), the only extrema of $P_{h+1}$ are at $\alpha_{h,i}$, $i = 1, 2, \ldots, h$, we conclude
that there are $h - 1$ zeros of $P_{h+1}$ separating the $\alpha_{h,i}$, $i = 1, 2, \ldots, h$. Since
$P_{h+1}(\alpha_{h,1}) > 0$ we conclude that $P_h < 0$ for $x < \alpha_{h,1}$. This fact, combined with
(8), shows $P'_{h+1}(x) > 0$ for $x < \alpha_{h,1}$. $P_{h+1}(\alpha_{h,1})$ being positive, it follows that
there exists a zero of $P_{h+1}$ in $(-\infty, \alpha_{h,1})$. Similar reasoning establishes the
existence of a zero of $P_{h+1}$ in $(\alpha_{h,h}, \infty)$. Our theorem is thus established for
$N' > 0$. The case $N' < 0$ can be similarly treated.

Theorem Ht : IfD{x) > 0/or aU x, D" <0,N' D" < 0, iV' = 0, iV ^ 0, 

tAcn P„ , n = 2, 3, • • • , hoe n — 1 retd, distinct zeros which are separated by the 
zeros of Pn-i . 

Proof: Since $P_1 \equiv N = \text{const.}$, we see from (7) that $P_2$ is linear. The reasoning
of Theorem II$_1$ applies where we now start with $P_2$.

Theorem Ht : If D(x) > 0 for all x, except x = P, where D has a dovhle zero and 

if N' ^ a, N' — ^ D” < Q, n ss 1, 2, 3, • • • , then P, has n real, distind zeros 

which separate those of P„+i . 

Proof: Theorem h with k = 0 assures us that Pn and D have no zeros in 
common. The proof now follows the line of reasoning of Theorem Hi . 

Theorem Hi : If D{x) > 0 for all x except x = where D has a double zero and 

ifN' =^0, N N' -‘^D" <0, m= 1,2,3, ■■■, then Pnhasn- I real, 

distinct zeros which separate those of P«+i , n = 1, 2, 3, • • • . This theorem follows 
from Hi as did Ht from Hi . 

Points of Inflexion. Setting $k = 0$ in (4) leads to,

$$P''_{n+1}(x) = (n + 1)(n)\left[N' - \frac{n}{2} D''\right]\left[N' - \frac{n - 1}{2} D''\right] P_{n-1}.$$

This shows, under the assumptions of Theorems II$_1$ to II$_4$ inclusive, that the
points of inflexion of $P_{n+1}$ are at the zeros of $P_{n-1}$.

Hermite Polynomials. Theorem Hi and statement immediately above con- 
cerning points of inflexion apply directly to Hermite Polynomials where W s — ® 
and D s (t*. 


Lehigh University.



THE SIMULTANEOUS COMPUTATION OF GROUPS OF REGRESSION 
EQUATIONS AND ASSOCIATED MULTIPLE CORRELATION 

COEFFICIENTS 

By Paul S. Dwyer

1. Introduction. The need sometimes arises for the prediction of a number of 
different variables from a given group of so-called fundamental variables. In 
the work of college prediction, for example, one might desire regression equations 
predicting certain measures of college achievement (e.g., first semester average, 
first semester English grade, first semester mathematics grade, number of hours 
of A received during first semester, etc.) on the basis of a number of other factors 
(e.g., high school record, score on American Council on Education Psychological 
Examination, score on some standard English achievement test, score on some 
standard mathematics achievement test, etc.). It is the purpose of this paper 
to show how the regression coefficients and the associated multiple correlation
coefficients can be obtained simultaneously. The essence of the method is a 
simple device by which one solution of general normal equations may be made to 
serve for all cases. 

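2. The normal equations. Let $x_1, x_2, x_3, \ldots, x_n$ be the so-called fundamental
variables and let $x_k$ be the predicted variable. The normal equations
are computed by standard methods which result in one of the three types.

Type I. Normal equations for determining $b_0, b_1, b_2, b_3, \ldots, b_n$.

$$\begin{aligned}
b_0 N + b_1 \Sigma x_1 + b_2 \Sigma x_2 + b_3 \Sigma x_3 + \cdots + b_n \Sigma x_n - \Sigma x_k &= 0 \\
b_0 \Sigma x_1 + b_1 \Sigma x_1^2 + b_2 \Sigma x_1 x_2 + b_3 \Sigma x_1 x_3 + \cdots + b_n \Sigma x_1 x_n - \Sigma x_1 x_k &= 0 \\
b_0 \Sigma x_2 + b_1 \Sigma x_2 x_1 + b_2 \Sigma x_2^2 + b_3 \Sigma x_2 x_3 + \cdots + b_n \Sigma x_2 x_n - \Sigma x_2 x_k &= 0 \\
\cdots \\
b_0 \Sigma x_n + b_1 \Sigma x_n x_1 + b_2 \Sigma x_n x_2 + b_3 \Sigma x_n x_3 + \cdots + b_n \Sigma x_n^2 - \Sigma x_n x_k &= 0
\end{aligned}$$

Type II. Normal equations for determining $b_1, b_2, b_3, \ldots, b_n$, where
$\bar{x}_i = x_i - M_{x_i}$.

$$\begin{aligned}
b_1 \Sigma \bar{x}_1^2 + b_2 \Sigma \bar{x}_1 \bar{x}_2 + b_3 \Sigma \bar{x}_1 \bar{x}_3 + \cdots + b_n \Sigma \bar{x}_1 \bar{x}_n - \Sigma \bar{x}_1 \bar{x}_k &= 0 \\
b_1 \Sigma \bar{x}_2 \bar{x}_1 + b_2 \Sigma \bar{x}_2^2 + b_3 \Sigma \bar{x}_2 \bar{x}_3 + \cdots + b_n \Sigma \bar{x}_2 \bar{x}_n - \Sigma \bar{x}_2 \bar{x}_k &= 0 \\
\cdots \\
b_1 \Sigma \bar{x}_n \bar{x}_1 + b_2 \Sigma \bar{x}_n \bar{x}_2 + b_3 \Sigma \bar{x}_n \bar{x}_3 + \cdots + b_n \Sigma \bar{x}_n^2 - \Sigma \bar{x}_n \bar{x}_k &= 0
\end{aligned}$$

Type III. Normal equations for determining $\beta_1, \beta_2, \beta_3, \ldots, \beta_n$.

$$\begin{aligned}
\beta_1 + r_{12}\beta_2 + r_{13}\beta_3 + \cdots + r_{1n}\beta_n - r_{1k} &= 0 \\
r_{21}\beta_1 + \beta_2 + r_{23}\beta_3 + \cdots + r_{2n}\beta_n - r_{2k} &= 0 \\
\cdots \\
r_{n1}\beta_1 + r_{n2}\beta_2 + r_{n3}\beta_3 + \cdots + \beta_n - r_{nk} &= 0
\end{aligned}$$

The three types are special cases of the general

$$\begin{aligned}
d_{11}y_1 + d_{12}y_2 + d_{13}y_3 + \cdots + d_{1j}y_j + \cdots + d_{1n}y_n - d_{1k} &= 0 \\
d_{21}y_1 + d_{22}y_2 + d_{23}y_3 + \cdots + d_{2j}y_j + \cdots + d_{2n}y_n - d_{2k} &= 0 \\
d_{31}y_1 + d_{32}y_2 + d_{33}y_3 + \cdots + d_{3j}y_j + \cdots + d_{3n}y_n - d_{3k} &= 0 \\
\cdots \\
d_{i1}y_1 + d_{i2}y_2 + d_{i3}y_3 + \cdots + d_{ij}y_j + \cdots + d_{in}y_n - d_{ik} &= 0 \\
\cdots \\
d_{n1}y_1 + d_{n2}y_2 + d_{n3}y_3 + \cdots + d_{nj}y_j + \cdots + d_{nn}y_n - d_{nk} &= 0
\end{aligned}$$

where $y_i$ are the regression coefficients and $d_{ij} = d_{ji}$.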

The methods described in this paper are applicable to the general case and 
hence to each of the three particular types. 

In examining the normal equations, it is noticed that the first n terms of each 
equation are completely determined by the n fundamental variables. The 
equations, aside from the last terms, are identical no matter what variable is 
predicted. It is only necessary to devise a technique for separating the con- 
tributions of the dik terms. 

3. Solution by determinants. One method utilizes determinants. The
value $y_j$ is expressed in terms of a determinant involving a column with entries
$d_{1k}, d_{2k}, d_{3k}, \ldots, d_{nk}$. The determinant is expanded in terms of this column.

Specifically, let $D$ be the determinant of the coefficients of the $y_i$ and let $D_{ij}$
be the cofactor of any element $d_{ij}$ of $D$. Then

$$D = \sum_{i=1}^{n} d_{ij} D_{ij}$$

and

$$\begin{aligned}
y_1 &= \frac{1}{D}\,(D_{11} d_{1k} + D_{21} d_{2k} + D_{31} d_{3k} + \cdots + D_{j1} d_{jk} + \cdots + D_{n1} d_{nk}) \\
y_2 &= \frac{1}{D}\,(D_{12} d_{1k} + D_{22} d_{2k} + D_{32} d_{3k} + \cdots + D_{j2} d_{jk} + \cdots + D_{n2} d_{nk}) \\
\cdots \\
y_i &= \frac{1}{D}\,(D_{1i} d_{1k} + D_{2i} d_{2k} + D_{3i} d_{3k} + \cdots + D_{ji} d_{jk} + \cdots + D_{ni} d_{nk}) \\
\cdots \\
y_n &= \frac{1}{D}\,(D_{1n} d_{1k} + D_{2n} d_{2k} + D_{3n} d_{3k} + \cdots + D_{jn} d_{jk} + \cdots + D_{nn} d_{nk})
\end{aligned}$$

It is only necessary to compute $D_{ji}/D$ to find the coefficient of $d_{jk}$ in the expansion
of $y_i$.

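An illustration is given. The normal equations are

$$\begin{aligned}
\beta_1 + .3300\,\beta_2 + .2100\,\beta_3 - r_{1k} &= 0 \\
.3300\,\beta_1 + \beta_2 - .4800\,\beta_3 - r_{2k} &= 0 \\
.2100\,\beta_1 - .4800\,\beta_2 + \beta_3 - r_{3k} &= 0
\end{aligned}$$

from which at once

$$\begin{aligned}
\beta_1 &= \frac{1}{D}\,(.7696\, r_{1k} - .4308\, r_{2k} - .3684\, r_{3k}) \\
\beta_2 &= \frac{1}{D}\,(-.4308\, r_{1k} + .9559\, r_{2k} + .5493\, r_{3k}) \\
\beta_3 &= \frac{1}{D}\,(-.3684\, r_{1k} + .5493\, r_{2k} + .8911\, r_{3k})
\end{aligned}$$

and also

$$\begin{aligned}
D = .550072 &= (1.00)(.7696) + (.33)(-.4308) + (.21)(-.3684) \\
&= (.33)(-.4308) + (1.00)(.9559) + (-.48)(.5493) \\
&= (.21)(-.3684) + (-.48)(.5493) + (1.00)(.8911)
\end{aligned}$$

so that

$$\begin{aligned}
\beta_1 &= 1.3991\, r_{1k} - .7832\, r_{2k} - .6697\, r_{3k}. \\
\beta_2 &= -.7832\, r_{1k} + 1.7378\, r_{2k} + .9986\, r_{3k}. \\
\beta_3 &= -.6697\, r_{1k} + .9986\, r_{2k} + 1.6200\, r_{3k}.
\end{aligned}$$

It is only necessary to insert any given values $r_{1k}$, $r_{2k}$, $r_{3k}$ to obtain the coefficients
of any specific regression equation.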
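
The cofactor expansion of this illustration can be reproduced with a few lines of code. The following is a minimal sketch in plain Python, assuming only the three-variable correlation matrix given above; the helper names (`minor`, `cofactor_matrix`) and the variable `general` are illustrative and do not come from the paper.

```python
# Coefficients d_ij of the three normal equations of the illustration.
d = [[1.00, 0.33, 0.21],
     [0.33, 1.00, -0.48],
     [0.21, -0.48, 1.00]]


def minor(m, i, j):
    """2x2 determinant left after deleting row i and column j of a 3x3 matrix."""
    rows = [r for a, r in enumerate(m) if a != i]
    sub = [[v for b, v in enumerate(r) if b != j] for r in rows]
    return sub[0][0] * sub[1][1] - sub[0][1] * sub[1][0]


def cofactor_matrix(m):
    """Signed cofactors D_ij of a 3x3 matrix."""
    return [[(-1) ** (i + j) * minor(m, i, j) for j in range(3)] for i in range(3)]


cof = cofactor_matrix(d)

# Expand D in terms of the first column: D = sum_i d_i1 * D_i1.
det = sum(d[i][0] * cof[i][0] for i in range(3))

# The coefficient of r_jk in beta_i is D_ji / D.
general = [[cof[j][i] / det for j in range(3)] for i in range(3)]

for i, row in enumerate(general, start=1):
    terms = "  ".join(f"{c:+.4f} r_{j}k" for j, c in enumerate(row, start=1))
    print(f"beta_{i} = {terms}")
```

Once `general` is known, a specific regression equation only requires multiplying each row by the observed $r_{1k}$, $r_{2k}$, $r_{3k}$, which is the point of the section.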

4. Solutions without determinants. Theoretically the solution by determinants
is excellent, but as the number of variables increases the work of computing
the $n^2$ cofactors [or the different cofactors] becomes enormous.

We desire a technique for separating the contributions of the last terms when
determinants are not used. This can be accomplished by using a separate
column for each $d_{ik}$. Before algebraic manipulation, the value $d_{ik}$ is factored
from the column and, after manipulative solution is complete, the multiplication
by $d_{ik}$ is carried out.



As an example consider the normal equations

$$\begin{aligned}
\beta_1 + r_{12}\beta_2 - r_{1k} &= 0 \\
r_{21}\beta_1 + \beta_2 - r_{2k} &= 0
\end{aligned}$$

where $r_{12} = r_{21} = .3300$. Then the normal equations may be represented by
rows (1) and (2) of Table I.


TABLE I

Row   Operation                $\beta_1$ $\beta_2$  $r_{1k}$  $r_{2k}$
(1)                               1.0000     .3300   -1.0000
(2)                                .3300    1.0000             -1.0000
(3)   -.3300 times (2)            -.1089    -.3300               .3300
(4)   (1) + (3)                    .8911             -1.0000     .3300
(5)   -(4) divided by .8911      -1.0000              1.1222    -.3703
(6)   -.3300 times (5)             .3300              -.3703     .1222
(7)   -(2) + (6)                           -1.0000    -.3703    1.1222


The four decimal place solution, whose steps are indicated by (3) (4) (5) (6) (7),
is from (5) and (7)

$$\begin{aligned}
\beta_1 &= 1.1222\, r_{1k} - .3703\, r_{2k} \\
\beta_2 &= -.3703\, r_{1k} + 1.1222\, r_{2k}
\end{aligned}$$

This device may be combined with most of the standard methods of solving 
normal equations. 
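
The device of one column per right-hand term can be sketched in code as an elimination carried out on the coefficient matrix augmented with unit columns. The sketch below is an assumption-laden illustration, not the paper's worksheet: it uses a plain Gauss-Jordan elimination without pivoting (so it assumes non-zero pivots), it writes the unit columns with positive sign so that the result reads directly as $\beta_i = \sum_j c_{ij} r_{jk}$, and the function name `solve_with_unit_columns` is hypothetical.

```python
def solve_with_unit_columns(coeffs):
    """Return c[i][j], the coefficient of r_jk in beta_i.

    Each row i is augmented with a unit column that tracks the
    contribution of r_ik through the whole elimination.
    """
    n = len(coeffs)
    rows = [[float(v) for v in coeffs[i]] + [1.0 if j == i else 0.0 for j in range(n)]
            for i in range(n)]
    for p in range(n):
        pivot = rows[p][p]  # assumed non-zero; no row interchange is attempted
        rows[p] = [v / pivot for v in rows[p]]
        for q in range(n):
            if q != p:
                factor = rows[q][p]
                rows[q] = [vq - factor * vp for vq, vp in zip(rows[q], rows[p])]
    return [row[n:] for row in rows]


# Two-variable example with r_12 = r_21 = .3300.
general = solve_with_unit_columns([[1.0, 0.33], [0.33, 1.0]])
for i, row in enumerate(general, start=1):
    print(f"beta_{i} =", "  ".join(f"{c:+.4f} r_{j}k" for j, c in enumerate(row, start=1)))
```

The elimination is done once; every predicted variable $k$ then reuses `general` with its own $r_{1k}$, $r_{2k}$.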

5. Combination with Doolittle method. Especially to be recommended is a
combination of this device with the Doolittle method, which is recognized as a
most efficient method of solving normal equations in from five to ten variables
[1] [2]. One of the advantages of the Doolittle method is that related multiple
regression coefficients may be obtained from the same forward solution, though
additional back solutions are necessary [3].

The problem which led to the development of this technique was the simul- 
taneous prediction of scores on various occupations covered by the Strong 
Vocational Interest Blank from the scores on a few fundamental occupations, 
A multiple factor analysis revealed that five basic factors account for most of the 
scores. Five occupational scores, serving as approximations to the five basic 
factors, were used as the fundamental variables and the other scores were 
predicted from them. 

As an illustration of this prediction technique combined with the Doolittle
method, I have selected three test scores as fundamental since the solution based
on them shows all the steps of the Doolittle method and is shorter than the five
variable problem. Actually, solution by determinants (section 3) is advised
for problems involving three variables. The steps of the Doolittle solution are
presented in Table II. The results should be compared with those of the
determinant solution of section 3.

The first column indicates the row and the second the description of the
algebraic operation. The next three columns are the standard columns of a
Doolittle presentation with the conventional elimination of the lower left entries.
The next three columns carry through the Doolittle method with the values
$r_{1k}$, $r_{2k}$, $r_{3k}$ kept in separate columns. The last column is an adaptation of the
conventional summary check column of the Doolittle solution.


TABLE II

Generalized Doolittle Presentation

Row   Operation                  $\beta_1$ $\beta_2$ $\beta_3$  $r_{1k}$  $r_{2k}$  $r_{3k}$         S
(1)                                 1.0000     .3300     .2100   -1.0000                         .5400
(2)                                  .3300    1.0000    -.4800             -1.0000              -.1500
(3)                                  .2100    -.4800    1.0000                       -1.0000    -.2700
(4)   Repeat (1)                    1.0000     .3300     .2100   -1.0000                         .5400
(5)   Negative of (4)              -1.0000    -.3300    -.2100    1.0000                        -.5400
(6)   Repeat (2)                              1.0000    -.4800             -1.0000              -.1500
(7)   -.3300 times (4)                        -.1089    -.0693     .3300                        -.1782
(8)   (6) + (7)                                .8911    -.5493     .3300   -1.0000              -.3282
(9)   -(8) divided by .8911                  -1.0000     .6164    -.3703    1.1222               .3683
(10)  Repeat (3)                                        1.0000                       -1.0000    -.2700
(11)  -.2100 times (4)                                  -.0441     .2100                        -.1134
(12)  .6164 times (8)                                   -.3386     .2034    -.6164              -.2023
(13)  (10) + (11) + (12)                                 .6173     .4134    -.6164   -1.0000    -.5857
(14)  -(13) divided by .6173                           -1.0000    -.6697     .9985    1.6200     .9488
(15)  .6164 times (14)                                  -.6164    -.4128     .6155     .9986     .5848
(16)  (9) + (15)                             -1.0000              -.7831    1.7377     .9986     .9531
(17)  -.2100 times (14)                                  .2100     .1406    -.2097    -.3402    -.1992
(18)  -.3300 times (16)                        .3300               .2584    -.5734    -.3295    -.3145
(19)  (5) + (17) + (18)            -1.0000                        1.3990    -.7831    -.6697   -1.0537


The general solution is read from rows (19) (16) (14) and is

$$\begin{aligned}
\beta_1 &= 1.3990\, r_{1k} - .7831\, r_{2k} - .6697\, r_{3k}. \\
\beta_2 &= -.7831\, r_{1k} + 1.7377\, r_{2k} + .9986\, r_{3k}. \\
\beta_3 &= -.6697\, r_{1k} + .9985\, r_{2k} + 1.6200\, r_{3k}.
\end{aligned}$$

which agrees, aside from the last place, with the result of the solution by
determinants.

It is wise to check in the original equations (1), (2), (3) as soon as any $\beta_i$ is
found. Row (14), for example, should be checked by showing

$$\begin{aligned}
(-.6697)(1.00) + (.9985)(.33) + (1.6200)(.21) &= .0000 \\
(-.6697)(.33) + (.9985)(1.00) + (1.6200)(-.48) &= -.0001 \\
(-.6697)(.21) + (.9985)(-.48) + (1.6200)(1.00) &= 1.0001
\end{aligned}$$

The same should be done with row (16) as soon as it is computed. Row (19)
should be treated similarly.
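
The substitution check described here amounts to multiplying one row of the general solution into the original coefficients and comparing with a unit column. A small sketch, assuming the four-decimal values read from rows (19), (16) and (14); the dictionary `general` and the function `check` are illustrative names only.

```python
# Coefficients of the original equations (1), (2), (3).
d = [[1.00, 0.33, 0.21],
     [0.33, 1.00, -0.48],
     [0.21, -0.48, 1.00]]

# Four-decimal general solution; key i holds the coefficients of
# r_1k, r_2k, r_3k in beta_i, as read from rows (19), (16), (14).
general = {
    1: [1.3990, -0.7831, -0.6697],
    2: [-0.7831, 1.7377, 0.9986],
    3: [-0.6697, 0.9985, 1.6200],
}


def check(i):
    """Substitute the beta_i row into each original equation.

    The three results should be close to 0, 0, 1 with the 1 in
    position i, up to rounding in the fourth decimal place.
    """
    coeffs = general[i]
    return [round(sum(c * d[j][m] for j, c in enumerate(coeffs)), 4)
            for m in range(3)]


for i in (3, 2, 1):  # the order in which the rows become available
    print(f"beta_{i} check:", check(i))
```

Running the check as soon as each row is found, rather than at the end, localises an arithmetic slip to the most recent back-solution step.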

6. Many regression equations. If large numbers of regression equations are
to be generated (the Strong Vocational Interest Study had 29 dependent variables),
the following technique is suggested. Make a table with columns
$r_{1k}$, $r_{2k}$, etc. and use the rows to indicate the different values of $k$. On another
slip of paper insert the general values $\beta_1, \beta_2, \beta_3, \ldots, \beta_n$ in successive rows so
that a folding of the paper will bring any general $\beta$ expansion in conjunction
with the $r$'s of any test, $k$. The scheme is illustrated in Table III.


TABLE III

No. Occupation           r1k     r2k     r3k     β1k     β2k     β3k       r
 1  Teacher             1.00     .33     .21    1.00     .00     .00    1.00
 2  Physicist            .33    1.00    -.48     .00    1.00     .00    1.00
 3  Office Worker        .21    -.48    1.00     .00     .00    1.00    1.00
 4  Doctor               .17     .79    -.52    -.03     .72    -.17     .81
 5  Lawyer              -.02     .16    -.59     .24    -.30    -.78     .64
 6  Engineer             .16     .78    -.02    -.37    1.21     .64     .93
    β1                1.3990  -.7831  -.6697  1.0000
    β2                -.7831  1.7377   .9986          1.0000
    β3                -.6697   .9985  1.6200                  1.0000
10  Mathematician        .46     .96    -.49     .19     .82    -.14     .97
    etc.

Thus, for the occupation of Engineer,

$\beta_1$ =  1.3990 (.16) + (-.7831)(.78) + (-.6697)(-.02) = -.37

$\beta_2$ =  -.7831 (.16) + (1.7377)(.78) + ( .9986)(-.02) = 1.21

$\beta_3$ =  -.6697 (.16) + ( .9985)(.78) + (1.6200)(-.02) =  .64
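
The folding of the slip against each row amounts to one matrix-by-vector
product per occupation. A minimal sketch, assuming the general solution of
rows (19), (16), (14) and the correlations of Table III; the names `GENERAL`,
`OCCUPATIONS` and `betas` are illustrative and not from the paper.

```python
# General solution: row i gives the coefficients of r_1k, r_2k, r_3k in beta_(i+1).
GENERAL = [
    [1.3990, -0.7831, -0.6697],   # beta_1, from row (19)
    [-0.7831, 1.7377, 0.9986],    # beta_2, from row (16)
    [-0.6697, 0.9985, 1.6200],    # beta_3, from row (14)
]

# Correlations (r_1k, r_2k, r_3k) of three of the occupations in Table III.
OCCUPATIONS = {
    "Doctor": (0.17, 0.79, -0.52),
    "Lawyer": (-0.02, 0.16, -0.59),
    "Engineer": (0.16, 0.78, -0.02),
}


def betas(r):
    """Apply the general solution to one set of correlations."""
    return [sum(c * x for c, x in zip(row, r)) for row in GENERAL]


table = {name: [round(b, 2) for b in betas(r)]
         for name, r in OCCUPATIONS.items()}
for name, values in table.items():
    print(name, values)
```

Once the general solution is in hand, each further dependent variable costs
only nine multiplications in the three-variable case, which is the point of
the scheme.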
The value of the multiple correlation coefficient is then computed from the
formula

    $r_{k.123 \cdots n} = \sqrt{\beta_{1k} r_{1k} + \beta_{2k} r_{2k} + \cdots + \beta_{nk} r_{nk}}$

In the illustration above

    $r_{k.123} = \sqrt{(-.37)(.16) + (1.21)(.78) + (.64)(-.02)}$

    $= .93$
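
The same computation in code, as a short sketch; the function name
`multiple_correlation` is an assumption, and the inputs are the rounded
Engineer values used in the illustration.

```python
import math


def multiple_correlation(betas, r):
    """Square root of the sum of products beta_ik * r_ik."""
    return math.sqrt(sum(b * x for b, x in zip(betas, r)))


engineer_betas = [-0.37, 1.21, 0.64]
engineer_r = [0.16, 0.78, -0.02]

r_engineer = multiple_correlation(engineer_betas, engineer_r)
print(round(r_engineer, 2))
```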

7. Regression equations by deletion. The method of getting related regression
coefficients and correlation coefficients, described by Kurtz [3], is also
applicable. Again, a problem involving more than three variables is needed to
show the real value of the scheme, but the technique may be illustrated in the
three variable case. We wish to find, from the forward solution of Table II,
the regression equation and the multiple correlation coefficient when the first two
fundamental variables only are used. We delete all columns involving test 3
and complete the back solution as indicated in Table IV, which may be viewed
as a substitute for the last ten rows of Table II.


TABLE IV
(See Table II)

Row   Operation                      β1       β2      r1k      r2k
(20)  Repeat (9)                        -1.0000   -.3703   1.1222
(21)  -.3300 times (20)                   .3300    .1222   -.3703
(22)  (5) + (21)               -1.0000            1.1222   -.3703


The results are

    $\beta_1 = 1.1222\, r_{1k} - .3703\, r_{2k}$,
    $\beta_2 = -.3703\, r_{1k} + 1.1222\, r_{2k}$,

and these agree with the results of section 4.
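
As an independent check on these two-variable results (and not a rendering of
the deletion procedure itself), the inverse of a two-by-two correlation matrix
can be written in closed form. The function name `two_variable_solution` is an
assumption made for the sketch.

```python
def two_variable_solution(r12):
    """Coefficients of r_1k and r_2k in beta_1 and beta_2 for two predictors.

    The inverse of [[1, r12], [r12, 1]] is
    [[1, -r12], [-r12, 1]] divided by (1 - r12**2).
    """
    d = 1.0 - r12 * r12
    return [[1.0 / d, -r12 / d],
            [-r12 / d, 1.0 / d]]


coef = two_variable_solution(0.33)
for row in coef:
    print([round(v, 4) for v in row])
```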

8. The simplified back solution. In every case in which the $\beta$'s have been
given in terms of $r$'s the matrix of the coefficients is symmetric (sections 3, 4, 5, 7).
One wonders if this symmetry is generally true and if it holds for normal
equations of Type I or Type II.

Determinants are much more useful in establishing general properties, such 
as the one under discussion, than they are in computing the values of regression 
coefficients in the case of a problem involving many variables. We return to the 
determinant notation of section 3. 

In each of the three types, and hence in the general case, $d_{ij} = d_{ji}$ so
that $D$ is a symmetric determinant, $D_{ij} = D_{ji}$ and
$\frac{D_{ij}}{D} = \frac{D_{ji}}{D}$. Hence the matrix of the
coefficients of the solution is symmetric.
This result may be used (1) to check the expanded results or (2) to eliminate
some of the work of the back solution. The $n$ coefficients must be recorded for
$\beta_n$, after which the column indicated by $r_{nk}$ may be dropped. The first
$n - 1$ coefficients must be computed for $\beta_{n-1}$, after which the column
indicated by $r_{n-1,k}$ may be dropped, etc. The italicized entries in Table II
are the ones which are eliminated in this way. The remaining coefficients are
sufficient to completely determine the symmetric matrix.
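
A small sketch of how the retained coefficients fill out the whole matrix.
It assumes the retained entries are stored as a lower triangle, the last
regression coefficient keeping all n entries and each earlier one keeping one
fewer; the name `complete_symmetric` is illustrative.

```python
def complete_symmetric(lower):
    """Rebuild a symmetric matrix from its lower triangle.

    lower[i] holds the coefficients of r_1k ... r_(i+1)k retained for
    beta_(i+1); the dropped entries are supplied by symmetry.
    """
    n = len(lower)
    full = [[0.0] * n for _ in range(n)]
    for i, kept in enumerate(lower):
        for j, value in enumerate(kept):  # j runs over columns 0..i
            full[i][j] = value
            full[j][i] = value
    return full


lower = [
    [1.3990],                    # beta_1: coefficient of r_1k only
    [-0.7831, 1.7377],           # beta_2: coefficients of r_1k, r_2k
    [-0.6697, 0.9985, 1.6200],   # beta_3: all three coefficients
]

full = complete_symmetric(lower)
for row in full:
    print(row)
```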

The summary right hand check column can not be readily used in the simplified
back solution but it is hardly to be recommended anyway. Kurtz [3]
argues against it on the ground that it is not necessary. The essential check is
to see that each $\beta$ solution satisfies all of the original equations.

9. Conclusion. This paper provides a technique for the computation of 
general regression equations and shows how the technique may be combined 
with the Doolittle method in providing a practical means of mass prediction. 

University of Michigan. 
CONSTITUTION

Article I

NAME AND PURPOSE 

1. This organization shall be known as the Institute of Mathematical Sta- 
tistics. 

2. Its object shall be to promote the interests of mathematical statistics. 

Article II 
MEMBERSHIP 

1. The membership of the Institute shall consist of Members, Fellows, 
Honorary Members, and Sustaining Members. 

2. Fellows shall be the only voting members of the Institute. 

Article III 

OFFICERS, BOARD OF DIRECTORS, COMMITTEE ON MEMBERSHIP, 
AND COMMITTEE ON PUBLICATIONS 

1. The Officers of the Institute shall be a President, two Vice-Presidents,
and a Secretary-Treasurer, elected for a term of one year by a majority ballot 
at the annual meeting of the Institute. Voting may be in person or by mail. 

(a) Exception. The first group of Officers shall be elected by a majority 
vote of the individuals present at the organization meeting, and shall serve until 
December 31, 1936. 

2. The Board of Directors of the Institute shall consist of the Officers and 
the previous President. 

3. The Institute shall have a Committee on Membership composed of three 
Fellows. At their first meeting subsequent to the adoption of this Constitution, 
the Board of Directors shall elect three members as Fellows to serve as the 
Committee on Membership, one member of the Committee for a term of one 
year, another for a term of two years, and another for a term of three years. 
Thereafter the Board of Directors shall elect from among the Fellows one 
member annually at their first meeting after their election for a term of three 
years. The president shall designate one of the Vice-Presidents as Chairman 
of this Committee. 

4. The Institute shall have a Committee on Publications composed of three 
Members or Fellows elected by the Board of Directors. The President shall 
designate a Vice-President as Ex Officio Chairman of this Committee. 
Article IV
MEETINGS 

1. A meeting for the presentation and discussion of papers, for the election of 
Officers, and for the transaction of other business of the Institute shall be held 
annually at such time as the Board of Directors may designate. Additional 
meetings may be called from time to time by the Board of Directors and shall be 
called at any time by the President upon written request from ten Fellows. 
Notice of the time and place of meeting shall be given to the membership by the 
Secretary-Treasurer at least thirty days prior to the date set for the meeting. 
All meetings except executive sessions shall be open to the public. Only 
papers accepted by a Program Committee appointed by the President may be 
presented to the Institute. 

2. The Board of Directors shall hold a meeting immediately after their 
election and again immediately before the expiration of their term. Other 
meetings of the Board may be held from time to time at the call of the President 
or any two members of the Board. Notice of each meeting of the Board, other 
than the two regular meetings, together with a statement of the business to be 
brought before the meeting, must be given to the members of the Board by the 
Secretary-Treasurer at least five days prior to the date set therefor. Should 
other business be passed upon, any member of the Board shall have the right to 
reopen the question at the next meeting. 

3. The Committee on Membership shall hold a meeting immediately after the 
annual meeting of the Institute. Further meetings of the Committee may be 
held from time to time at the call of the Chairman or any member of the Com- 
mittee provided notice of such call and the purpose of the meeting is given to 
the members of the Committee by the Secretary-Treasurer at least five days 
before the date set therefor. Should other business be passed upon, any 
member of the Committee shall have the right to reopen the question at the 
next meeting. 

4. At a regularly convened meeting of the Board of Directors, three members 
shall constitute a quorum. At a regularly convened meeting of the Committee 
on Membership, two members shall constitute a quorum. 

Article V 
PUBLICATIONS 

1. In the beginning, the “Annals of Mathematical Statistics” shall serve as
the official journal for the Institute. Other publications may be originated 
by the Board of Directors as occasion arises. 

Article VI 

EXPULSION OR SUSPENSION 

1. Except for non-payment of dues, no one shall be expelled or suspended
except by action of the Board of Directors with not more than one negative vote. 
Article VII
AMENDMENTS 

1. This constitution may be amended by an affirmative two-thirds vote at
any regularly convened meeting of the Institute provided notice of such proposed
amendment shall have been sent to each Fellow by the Secretary-Treasurer at
least thirty days before the date of the meeting at which the proposal is to be
acted upon. Voting may be in person or by mail.

BY-LAWS 

Article I

DUTIES OF THE OFFICERS, BOARD OF DIRECTORS, COMMITTEE 
ON MEMBERSHIP, AND COMMITTEE ON PUBLICATIONS 

1. The President, or in his absence, one of the Vice-Presidents, or in the 
absence of the President and both Vice-Presidents, a Fellow selected by vote 
of the Fellows present, shall preside at the meetings of the Institute and of the 
Board of Directors. At meetings of the Institute, the presiding officer shall 
vote only in the case of a tie, but at meetings of the Board of Directors he may 
vote in all cases. At least three months before the date of the annual meeting, 
the President shall appoint a Nominating Committee of three members. It 
shall be the duty of the Nominating Committee to make nominations for 
Officers to be elected at the annual meeting and the Secretary-Treasurer shall 
notify all Fellows at least thirty days before the annual meeting. Additional 
nominations may be submitted in writing, if signed by at least ten Fellows of 
the Institute, up to the time of the meeting. 

2. The Secretary-Treasurer shall keep a full and accurate record of the 
proceedings at the meetings of the Institute and of the Board of Directors, 
send out calls for said meetings and, with the approval of the President and the 
Board, carry on the correspondence of the Institute. Subject to the direction 
of the Board, he shall have charge of the archives and other tangible and 
intangible property of the Institute. He shall send out calls for annual dues and 
acknowledge receipt of same; pay all bills approved by the President for expendi- 
tures authorized by the Board or the Institute; keep a detailed account of all 
receipts and expenditures, prepare a financial statement at the end of each year 
and present an abstract of the same at the annual meeting of the Institute after 
it has been audited by a Member or Fellow of the Institute appointed by the 
President as Auditor. The Auditor shall report to the President. 

3. The Board of Directors shall have charge of the funds and of the affairs 
of the Institute, with the exception of those affairs specifically assigned to the 
President or to the Committee on Membership. The Board shall have au- 
thority to fill all vacancies ad interim, occurring among the Officers, Board of 
Directors, or in any of the Committees. The Board may appoint such other 
committees as may be required from time to time to carry on the affairs of the
Institute. 

4. The Committee on Membership shall prepare and make available through 
the Secretary-Treasurer an announcement indicating the qualifications requisite 
for the different grades of membership. 

5. The Committee on Publications, under the general supervision of the 
Board of Directors, shall have charge of all matters connected with the publica- 
tions of the Institute, and of all books, pamphlets, manuscripts and other 
literary or scientific material collected by the Institute. Once a year this 
Committee shall cause to be printed in the Official Journal the Constitution 
and By-Laws and a classified list of all the Members and Fellows of the Institute. 

Article II 
DUES 

1. Members shall pay five dollars at the time of admission to membership 
and shall receive the full current volume of the Official Journal. Thereafter, 
Members shall pay five dollars annual dues. The annual dues of Fellows shall 
be five dollars. The annual dues of Sustaining Members shall be fifty dollars. 
Honorary Members shall be exempt from all dues. 

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow or Member include a subscription to the 
Official Journal. The annual dues of a Sustaining Member include two sub- 
scriptions to the Official Journal. 

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone 
whose dues may be six months in arrears, and to accompany such notice by a 
copy of this Article. If such person fail to pay such dues within three months 
from the date of mailing such notice, the Secretary-Treasurer shall report the 
delinquent one to the Board of Directors, by whom the person’s name may be 
stricken from the rolls and all privileges of membership withdrawn. Such 
person may, however, be re-instated by the Board of Directors upon payment 
of the arrears of dues. 


Article III 
SALARIES 

1. The Institute shall not pay a salary to any Officer, Director, or member of 
any committee. 


Article IV 
AMENDMENTS 

1. These By-Laws may be amended in the same manner as the Constitution 
or by a majority vote at any regularly convened meeting of the Institute, if the 
proposed amendment has been previously approved by the Board of Directors. 